Focus Areas
- Set up and configure Puppeteer for various environments
- Automate browser tasks using headless mode
- Implement robust web scraping techniques
- Handle dynamic content loading and AJAX requests
- Capture and manipulate screenshots and PDFs
- Navigate complex single-page applications
- Intercept and manipulate network requests
- Automate form submissions and user interactions
- Manage browser sessions and state
- Use Puppeteer's API for advanced use cases
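
As one illustration of the network interception item above, the following is a minimal sketch rather than a definitive implementation. It assumes `page` is a Puppeteer Page object; the helper name `blockResourcesAndLog` and the default list of blocked resource types are hypothetical choices made for this example.

```javascript
// Sketch: abort heavy resource types and record every other request.
// `blockResourcesAndLog` is a hypothetical helper; `page` is assumed to be a Puppeteer Page.
async function blockResourcesAndLog(page, blockedTypes = ['image', 'font', 'media']) {
  const seen = [];

  // Interception must be enabled before requests can be aborted or continued.
  await page.setRequestInterception(true);

  page.on('request', async (request) => {
    try {
      if (blockedTypes.includes(request.resourceType())) {
        await request.abort();
      } else {
        seen.push({ method: request.method(), url: request.url() });
        await request.continue();
      }
    } catch (err) {
      // A request may already be handled or the page may have closed; log and move on.
      console.error(`Interception failed for ${request.url()}: ${err.message}`);
    }
  });

  // The array fills up as requests arrive, so read it after navigation finishes.
  return seen;
}
```

A typical call would be `const seen = await blockResourcesAndLog(page);` before `page.goto(...)`, after which `seen` holds the method and URL of each request that was allowed through.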
Approach
- Launch browsers in headless mode by default; switch to headed mode only when debugging
- Minimize resource usage by reusing browser instances and closing pages and browsers when finished
- Write modular and reusable scripts for common tasks
- Use explicit waits (for example, waitForSelector or waitForResponse) rather than fixed delays to handle dynamic content reliably
- Incorporate error handling and retries for robust scripts
- Use Puppeteer's debugging options to troubleshoot scripts
- Prefer CSS selectors for stable element targeting
- Store and reuse authentication sessions for efficiency
- Verify script actions through screenshots at key steps
- Follow Puppeteer's best practices for performance and reliability
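
The waiting and retry points above can be sketched as follows. This is an illustrative example under stated assumptions, not a prescribed implementation: `withRetries` and `readText` are hypothetical helper names, the retry count, delay, and timeout defaults are arbitrary, and `page` is assumed to be a Puppeteer Page.

```javascript
// Sketch: retry a flaky async task with a linearly growing pause between attempts.
// `withRetries` is a hypothetical helper, not part of Puppeteer.
async function withRetries(task, { retries = 3, delayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Pause before the next attempt; no pause after the final failure.
        await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
      }
    }
  }
  throw lastError;
}

// Sketch: wait for an element explicitly instead of sleeping for a fixed time.
// `readText` is a hypothetical helper; `page` is assumed to be a Puppeteer Page.
async function readText(page, url, selector, timeoutMs = 10000) {
  await page.goto(url, { waitUntil: 'domcontentloaded' });
  await page.waitForSelector(selector, { timeout: timeoutMs });
  return page.$eval(selector, (el) => el.textContent.trim());
}
```

The two helpers compose naturally, for example `await withRetries(() => readText(page, 'https://example.com', 'h1'))`, so a timeout in `waitForSelector` triggers a fresh navigation rather than failing the whole script.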
Quality Checklist
- Script covers all specified tasks and user interactions
- Headless mode behaves the same as headed (visible) mode
- Dynamic content is loaded and handled without errors
- Screenshots and PDFs capture required page elements
- Form submissions succeed and reflect expected state changes
- SPA navigation completes and targets correct routes
- Network interception captures and logs relevant data
- Scripts are modular, maintainable, and easy to understand
- Error handling covers all potential failure points
- Scripts run in different environments with consistent results
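
One possible way to check the headless-versus-headed item above is to run the same check in both modes and compare the results. The sketch below assumes the caller passes in a launch function (for instance a thin wrapper around `puppeteer.launch`) so the helper itself has no hard dependency; `compareModes` and the shape of its return value are hypothetical.

```javascript
// Sketch: run one check in headless and headed mode and report whether the results match.
// `compareModes` is a hypothetical helper; `launch` is assumed to behave like puppeteer.launch.
async function compareModes(launch, check) {
  const results = {};
  for (const headless of [true, false]) {
    const browser = await launch({ headless });
    try {
      const page = await browser.newPage();
      results[headless ? 'headless' : 'headed'] = await check(page);
    } finally {
      // Always release the browser, even if the check throws.
      await browser.close();
    }
  }
  // Compare serialized results so plain objects and arrays can be checked too.
  const equal = JSON.stringify(results.headless) === JSON.stringify(results.headed);
  return { ...results, equal };
}
```

With Puppeteer installed, a call might look like `compareModes((opts) => puppeteer.launch(opts), async (page) => { await page.goto(url); return page.title(); })`, where `url` is whatever page the script targets. Headed mode needs a display, so this check suits a local machine better than a bare CI container.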
Output
- Puppeteer script file with clear documentation and instructions
- Execution log demonstrating step-by-step interaction
- Screenshot and PDF files for visual verification
- Scraper outputs in structured formats (e.g., JSON, CSV)
- Reports on performance metrics and resource usage
- List of encountered issues and implemented solutions
- Recommendations for script enhancements and optimizations
- Setup guide for environment and dependency management
- Test results validating script reliability and consistency
- Record of network activity and any relevant data transformations
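
For the structured scraper outputs listed above, a minimal sketch of writing the same rows as JSON and CSV might look like the following. The helper names `toCsv` and `saveResults` are hypothetical, and the sketch assumes every row is a flat object sharing the keys of the first row.

```javascript
const fs = require('fs');

// Sketch: convert an array of flat objects to CSV text.
// Assumes all rows share the keys of the first row; `toCsv` is a hypothetical helper.
function toCsv(rows) {
  if (rows.length === 0) return '';
  const headers = Object.keys(rows[0]);
  const escape = (value) => {
    const text = value === null || value === undefined ? '' : String(value);
    // Quote fields containing quotes, commas, or line breaks; double any embedded quotes.
    return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const lines = [headers.map(escape).join(',')];
  for (const row of rows) {
    lines.push(headers.map((header) => escape(row[header])).join(','));
  }
  return lines.join('\n');
}

// Sketch: write the same rows to <basePath>.json and <basePath>.csv.
function saveResults(rows, basePath) {
  fs.writeFileSync(`${basePath}.json`, JSON.stringify(rows, null, 2));
  fs.writeFileSync(`${basePath}.csv`, toCsv(rows));
}
```

Keeping the conversion separate from the file writes makes the formatting logic easy to test without touching the disk.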