Posted over 1 year ago by Aaron McGarvey
I have a bunch of spiders running on Scrapy Cloud on a periodic basis. I need to be able to write the scraped data to multiple locations.
I am able to do so when I run the spiders on my local machine using the FEEDS setting, which I set in custom_settings:
custom_settings = {
    "FEEDS": {
        "s3://systems_data/systems_sample_results_page/FILE1.jsonl": {"format": "jsonlines"},
        "s3://systems_data/systems_historical_results/FILE2.jsonl": {"format": "jsonlines"},
    }
}
In Scrapy Cloud, there is a custom setting for FEED_URI but not FEEDS. As far as I can tell, FEED_URI only allows writing to one location.
How do I write scraped data to multiple locations in Scrapy Cloud?
Locally, I am using macOS, Scrapy 2.9.0, and Python 3.8.8.
0 Votes
Adrian Chaves posted 11 months ago Admin Best Answer
You can use FEEDS in Scrapy Cloud as well; the settings UI accepts arbitrary setting names, not only the ones listed.
To enter an arbitrary setting name, select the first entry in the drop-down list, “Custom Name”.
You can also use the “Raw Settings” tab to edit all your settings as plain text, which is sometimes easier.
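For a concrete example, here is a minimal sketch of what the Raw Settings tab might contain for the two feeds from the question, assuming the tab takes your settings as a single JSON object (the S3 paths are the ones from the original post):

{
    "FEEDS": {
        "s3://systems_data/systems_sample_results_page/FILE1.jsonl": {"format": "jsonlines"},
        "s3://systems_data/systems_historical_results/FILE2.jsonl": {"format": "jsonlines"}
    }
}

With FEEDS set this way in Scrapy Cloud, each run should export items to both locations, just as the custom_settings dict does locally.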