Queues Architecture

I'm not sure you need queues to do this.

I would set a "last_scraped_at" timestamp against each URL you scrape.
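
For example, a minimal migration sketch (assuming a `urls` table; the class and column names are just illustrative):

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

class AddLastScrapedAtToUrls extends Migration
{
    public function up()
    {
        Schema::table('urls', function (Blueprint $table) {
            // Nullable so URLs that have never been scraped can be told apart
            $table->timestamp('last_scraped_at')->nullable();
        });
    }

    public function down()
    {
        Schema::table('urls', function (Blueprint $table) {
            $table->dropColumn('last_scraped_at');
        });
    }
}
```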

Then in your cron:

  • select ~100 URLs, ordered by last_scraped_at ASC
  • scrape and process each URL

This way you will always process the oldest URL first. You could also add a priority field and sort by that first, so certain URLs always get processed ahead of the rest.
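
As a rough sketch of the cron job body with Eloquent (the `Url` model, `scrape()` helper, and column names here are assumptions, not something from your setup):

```php
<?php

use Carbon\Carbon;

// Pick the batch: optional high-priority URLs first,
// then the least recently scraped.
$urls = Url::orderBy('priority', 'desc')
    ->orderBy('last_scraped_at', 'asc')
    ->take(100)
    ->get();

foreach ($urls as $url) {
    scrape($url); // your scraping/processing logic

    // Stamp the row so it drops to the back of the line next run
    $url->last_scraped_at = Carbon::now();
    $url->save();
}
```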

On cron jobs like this, where you don't know how long processing might take, I have found lock files to be pretty useful for preventing multiple instances running at the same time. Just check for the lock file when the job starts: if it is locked, quit out; otherwise, lock the file and run your script (and remember to unlock the file at the end :))

http://php.net/manual/en/function.flock.php
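
A minimal sketch of that guard using flock() (the lock path is just an example):

```php
<?php

// Open (or create) the lock file; 'c' creates it without truncating.
$handle = fopen('/tmp/scraper.lock', 'c');

// Try to take an exclusive, non-blocking lock.
if (!flock($handle, LOCK_EX | LOCK_NB)) {
    // Another instance holds the lock; quit out.
    fclose($handle);
    exit;
}

// ... run the scraping job here ...

// Release the lock when done (it is also released automatically on exit).
flock($handle, LOCK_UN);
fclose($handle);
```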

