To better understand how you can take advantage of the data scraping functionality, let’s create an automation that extracts some specific information from Amazon.
It is recommended to run your web automations on Internet Explorer 11 or later, Mozilla Firefox 50 or later, or the latest version of Google Chrome.
Let’s say you are a sports gear vendor and are interested in finding out the latest prices for volleyball balls on Amazon. You can do the following:
- Open Internet Explorer and navigate to www.amazon.com.
- In the search box, type "volleyball ball" and press Enter. The results are displayed on the web page.
- In Studio, on the Design tab, in the Wizards group, click Data Scraping. The Extract Wizard is displayed.
- Following the wizard, select the first and last items in the web page. The Configure Columns wizard step is displayed.
- Select the Extract URL check box.
- Change the names of the column headers.
- Click Next. A preview of the data is displayed and the fields you selected are highlighted in the web browser.
- Click the Extract Correlated Data button. The Extract Wizard starts again.
- Following the wizard again, indicate the prices of the items. The Configure Columns step is displayed.
- Change the name of the new column, and click Next. The data preview is displayed.
- (Optional) Change the order of the columns by dragging them into place.
- Click Finish. The Indicate Next Link window is displayed, prompting you to indicate the Next button if the data spans more than one page.
- Click Yes and select the Next Page button in Amazon. The project is saved and displayed in the Designer panel. Note that a data table variable, ExtractDataTable, has been automatically generated.
- Drag an Excel Application Scope activity under the Data Scraping container.
Install the Excel activities package using Manage Packages to have access to these activities.
- In the Properties panel, in the WorkbookPath field, type the file path of an existing Excel file to which you want to write the data.
- In the Variables panel, change the scope of the automatically generated data table variable to Sequence.
- Drag a Write Range activity into the Excel Application Scope.
- In the Properties panel, in the DataTable field, add the ExtractDataTable variable. The final project should look as in the following screenshot.
- Press F5. The automation is executed.
- Open the Excel file you specified in the WorkbookPath field. Note that all columns are populated correctly.
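
For readers who want to see the logic behind the steps above outside of Studio, the following Python snippet is a rough sketch of the same idea: extract a name and URL for each item, correlate a price with it, follow the Next link across pages, and write the resulting table to a file. It is not how the wizard is implemented. The sample markup, the class names (`item`, `title`, `price`, `next`), and the helper names `ItemParser`, `scrape`, and `to_csv` are all hypothetical, and real Amazon pages are structured differently. CSV output stands in for the Excel Write Range step.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample pages; this markup is an assumption for illustration only.
PAGES = {
    "page1": """
      <div class="item"><a class="title" href="/p/1">Beach Volleyball</a>
        <span class="price">$19.99</span></div>
      <div class="item"><a class="title" href="/p/2">Indoor Volleyball</a>
        <span class="price">$24.50</span></div>
      <a class="next" href="page2">Next</a>
    """,
    "page2": """
      <div class="item"><a class="title" href="/p/3">Training Volleyball</a>
        <span class="price">$12.00</span></div>
    """,
}


class ItemParser(HTMLParser):
    """Collects one row per item: name, URL, and the correlated price."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.next_page = None
        self._field = None  # column that the upcoming text belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class")
        if tag == "div" and cls == "item":
            # A new item starts a new row in the table.
            self.rows.append({"Name": "", "URL": "", "Price": ""})
        elif tag == "a" and cls == "title" and self.rows:
            self.rows[-1]["URL"] = attrs.get("href", "")
            self._field = "Name"
        elif tag == "span" and cls == "price" and self.rows:
            self._field = "Price"
        elif tag == "a" and cls == "next":
            self.next_page = attrs.get("href")

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None


def scrape(pages, start):
    """Follows the Next link from page to page and accumulates all rows."""
    table, page, seen = [], start, set()
    while page in pages and page not in seen:
        seen.add(page)  # guard against a Next link that loops back
        parser = ItemParser()
        parser.feed(pages[page])
        table.extend(parser.rows)
        page = parser.next_page
    return table


def to_csv(table):
    """Serializes the table with a header row, similar to writing a range."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["Name", "URL", "Price"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(table)
    return buf.getvalue()


table = scrape(PAGES, "page1")
print(to_csv(table))
```

Each dictionary in `table` plays the role of one row in the generated data table variable, with the price kept on the same row as the item it belongs to.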