Sorry, can you explain how you want to detect a trend? Programming language aside, what do you want to compare? The news title field against what?
Thanks iboinas for the reply. My idea is to compare news titles. For example, if I have 10 news titles, I would compare title 1 with title 2, and so on.
You need to generate a list of keywords for each post using the title. Take a look at the list of stop words in English. https://gist.github.com/brianteachman/4522951
From each title, normalize the string and omit the stop words; what is left will (most probably) be the tags. After that step I think it's quite straightforward. You need to keep a table of posts and tags, then query the tag counts for the last 24 hours and sort them by count, descending. The most used tag should come first.
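To make the "normalize and omit the stop words" step concrete, here is a minimal sketch in plain PHP. The function name extractTags and the short stop word list are made up for illustration, and it assumes English titles where plain lowercasing and stripping non-alphanumeric characters is good enough.

```php
<?php
// Hypothetical helper: turn one news title into a list of candidate tags.
// $stopWords is assumed to contain lowercase stop words (for example from the gist linked above).
function extractTags(string $title, array $stopWords): array
{
    // Normalize: lowercase, then turn every run of non-alphanumeric characters into a single space.
    $normalized = preg_replace('/[^a-z0-9]+/', ' ', strtolower($title));

    // Split on whitespace and drop empty strings.
    $words = preg_split('/\s+/', $normalized, -1, PREG_SPLIT_NO_EMPTY);

    // Remove stop words and duplicates, then reindex from 0.
    return array_values(array_unique(array_diff($words, $stopWords)));
}

$stopWords = ['a', 'about', 'the', 'in', 'of']; // shortened list, for illustration only
print_r(extractTags('The Rise of Laravel in 2017', $stopWords)); // ['rise', 'laravel', '2017']
```

Each returned word could then be stored as one row in the posts/tags table described above.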
Hmmm, so you want to compare every news title with every other one!
So my best advice is to go via Collection functions, to do the work in processor/memory instead of in the database.
To avoid pulling 2000+ items into memory, use the chunk method to process smaller chunks each time. Example, as per the documentation:
    Flight::chunk(200, function ($flights) {
        foreach ($flights as $flight) {
            // code here
        }
    });
It means it would process all Flight Models, but 200 at a time...
So, right now, you want to count how often each word is repeated across the titles, and get the top 5 trending words?
I think this code points you in the right direction.
Please pretend SparePartsOrder = News (model) and client_reference = title.
    $words_in_news = collect();
    // Read the records from the database 200 at a time, so a large result set does not use lots of memory
    SparePartsOrder::where('id', '>', 4)->chunk(200, function ($orders) use (&$words_in_news) {
        foreach ($orders as $order) {
            $words_in_this_new = explode(' ', $order->client_reference);
            foreach ($words_in_this_new as $word) {
                if (!$words_in_news->has($word)) {
                    // word doesn't exist in the collection yet: add it with count = 1
                    $words_in_news->put($word, 1);
                } else {
                    // word already exists in the collection: increment its count
                    $words_in_news->put($word, $words_in_news->get($word) + 1);
                }
            }
        }
    });
    $ammount_or_top_words = 5;
    // Remove the empty-string key "" (left over from repeated spaces), sort by count descending,
    // then keep the first $ammount_or_top_words entries. take() is used instead of splice() so that
    // numeric words such as "2017" keep their keys.
    $words_in_news = $words_in_news->forget("")->sortByDesc(function ($count, $word) {
        return $count;
    })->take($ammount_or_top_words);
    // There you go, a collection with the top 5 trending words
    dd($words_in_news);
Of course, this is code I tested within a project of mine. Adapted to your case (so, untested), consider:
    $words_in_news = collect();
    // whereDate() compares only the date part, so this also matches if "date" is a DATETIME column
    // (an assumption about your schema; with a pure DATE column a plain where() works too)
    News::whereDate('date', '=', \Carbon\Carbon::today()->toDateString())->chunk(200, function ($news) use (&$words_in_news) {
        foreach ($news as $new) {
            $words_in_this_new = explode(' ', $new->title);
            foreach ($words_in_this_new as $word) {
                if (!$words_in_news->has($word)) {
                    // word doesn't exist in the collection yet: add it with count = 1
                    $words_in_news->put($word, 1);
                } else {
                    // word already exists in the collection: increment its count
                    $words_in_news->put($word, $words_in_news->get($word) + 1);
                }
            }
        }
    });
    $ammount_or_top_words = 5;
    // Remove the empty-string key "" (left over from repeated spaces), sort by count descending,
    // then keep the first $ammount_or_top_words entries. take() is used instead of splice() so that
    // numeric words such as "2017" keep their keys.
    $words_in_news = $words_in_news->forget("")->sortByDesc(function ($count, $word) {
        return $count;
    })->take($ammount_or_top_words);
    // There you go, a collection with the top 5 trending words
    dd($words_in_news);
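Bear in mind that the chunk method runs your code inside a Closure, so we bring $words_in_news in with use, and the & passes it by reference, so the data we save is still there outside the scope of the closure... (Since the collection is an object, put() would also work without the &, but the reference does no harm.)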
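If the & in the use clause looks mysterious, this small plain PHP sketch (no Laravel needed, all variable names invented for the demo) shows the difference between importing a variable into a closure by value, by reference, and importing an object:

```php
<?php
// By value: the closure works on its own copy of $count, so the outer variable stays at 0.
$count = 0;
$byValue = function () use ($count) {
    $count++;
};
$byValue();

// By reference: the closure writes to the outer $total.
$total = 0;
$byReference = function () use (&$total) {
    $total++;
};
$byReference();
$byReference();

// Objects are handles: changes made through the object are visible outside, even without &.
$bag = new ArrayObject();
$viaObject = function () use ($bag) {
    $bag['laravel'] = 1;
};
$viaObject();

var_dump($count, $total, count($bag)); // int(0), int(2), int(1)
```

So for scalars and arrays the & is required, while for an object like a collection it mainly matters if you reassign the variable itself inside the closure.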
Consider making a Service class or a Job for this, and run it via Laravel scheduling, a cron job, etc., so that it updates a table kept for this purpose.
I guess this points you to a clean, quick, low-memory approach with low database usage; even for large sets of 10,000 or 50,000 news items it should be effective.
What @astroanu suggested is also good: removing the stop words.
So, before selecting the top X words, remove the results that are in $stop_words:
    $stop_words = array(
        'a',
        'about',
        'above',
        'after',
        '...1000 words....',
        'yourself',
        'yourselves',
        'zero'
    );
    // forget() also accepts an array of keys, so all stop words are dropped in one call
    $words_in_news = $words_in_news->forget("")->forget($stop_words)->sortByDesc(function ($count, $word) {
        return $count;
    })->take($ammount_or_top_words);
    // There you go, a collection with the top 5 trending words
    dd($words_in_news);
Now you have the top 5 trending words that are not in the $stop_words array.
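One thing to watch: explode(' ', ...) keeps the original case, so "The" in a title would not match 'the' in the stop words. Below is a rough sketch of the same count / filter / sort / top N steps in plain PHP arrays, with lowercasing added. The sample titles, the short stop word list and $topN are made up for illustration; this is not the Collection code from above, just the same idea without the framework.

```php
<?php
// Hypothetical sample data standing in for the News titles.
$titles = [
    'Laravel 5.4 released',
    'The Laravel community is growing',
    'PHP 7.1 released',
];
$stopWords = ['a', 'the', 'is']; // shortened list, for illustration only
$topN = 2;

$counts = [];
foreach ($titles as $title) {
    // Lowercase first so "The" and "the" are counted (and filtered) as the same word.
    foreach (preg_split('/\s+/', strtolower($title), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        $counts[$word] = ($counts[$word] ?? 0) + 1;
    }
}

// Drop stop words by key, sort by count descending, keep the first $topN entries.
$counts = array_diff_key($counts, array_flip($stopWords));
arsort($counts);
$top = array_slice($counts, 0, $topN, true);

print_r($top);
```

With this sample data the two remaining words are "laravel" and "released", each counted twice.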
I think you are a happy programmer today?? :)
Love the Laravel Collections, arrays with superpowers. It's one of the most amazing things that this amazing framework has to offer!!
In fact, there is so much good stuff in the framework, it blows my mind!!
Congratulations to Taylor Otwell, all the contributors, and the whole community that keeps it growing!!
Happy new year all!!
You can also keep the stop words in a table and manage them that way, then retrieve them from the DB...
You can do so many things!! =)
Keep in mind:
-> chunk, to process large data sets in iterative steps
-> Collection methods, to work on the data and get new collections, which again have the same methods available
This avoids having separate tables, loads of queries, management processes, etc.
These are the key points for simplicity and performance. Laravel magic is all it takes!!
As you could see, with the code I provided, you just need the news to exist, that's it! :)
Many thanks astroanu and many thanks iboinas. We will use the solutions offered.