I'm curious as well, not only about the security aspect, but also about the negative SEO impact that all of the possible duplicate URLs can/will cause.
I just recently set up an L5 sandbox and it appears the default L5 installation allows for basically anything before the index.php and anything after the index.php to resolve to an existing controller/view instead of resulting in 404, which is what I think most would expect and prefer.
The above poster's example shows this partially:
http://laravel.com/ANY-UNDEFINED-KEYWORD/index.php/docs/5.0 (Typically you would expect this to trigger a 404 error, and not actually route to the index view of the docs/5.0 controller.)
Coincidentally though, it appears that using the above laravel.com site as an example, the home index controller handles this more appropriately. (kind of)
http://laravel.com/ANY-UNDEFINED-KEYWORD/index.php (does in fact kick back a 404 as expected, albeit not a very user-friendly one, as it's just an unstyled error page)
http://laravel.com/ANY-UNDEFINED-KEYWORD/index.php/ (with the slash, however, resolves just fine to the home index controller/view)
http://laravel.com/ANY-UNDEFINED-KEYWORD/index.php/ANY-UNDEFINED-KEYWORD (does in fact kick back a 404 as expected, and with an appropriately styled page even! This is the behavior I think devs would generally expect/prefer)
I also tested
http://laravel.com/ANY-UNDEFINED-KEYWORD/index.php/docs/5.0/ANY-UNDEFINED-KEYWORD/
(Again, I think most would expect a 404 here, but instead we get a 302 redirect back to http://laravel.com/ANY-UNDEFINED-KEYWORD/index.php/docs/5.0/ -- again really unexpected behavior and generally bad SEO practice. The 302 is confusing: it tells search engines that the requested page exists but has moved temporarily to the URL you are redirected to, which allows both to be indexed. To top that off, the URL you're redirected to still contains the undefined keyword, which you typically would not want indexed at all.)
Now the above behavior is not exactly "out of the box" in an unmodified L5 installation. In my own installation I tested the same url variations.
http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php -- Resolves with 200 ok, to welcome controller/view
http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php/ -- 301 redirect back to http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php (without the slash and still containing the undefined keyword, resolving with 200 ok to welcome controller/view)
http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php/ANY-UNDEFINED-KEYWORD -- 404's as expected (yay this is correct behaviour!)
http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php/ANY-UNDEFINED-KEYWORD/ -- 301 redirects back to http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php/ANY-UNDEFINED-KEYWORD without the slash and then 404's as expected (yay this is correct behaviour! the initial 301 redirect probably not preferred, but the end result of 404 negates this.)
http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php/auth/login -- Resolves with 200 ok, to auth/login controller/view
http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php/auth/login/ -- 301 redirects back to no slash version then resolves with 200 ok, to auth/login controller/view
http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php/auth/login/ANY-UNDEFINED-KEYWORD -- Resolves with 200 ok, to auth/login controller/view
http://somedomain.com/ANY-UNDEFINED-KEYWORD/index.php/auth/login/ANY-UNDEFINED-KEYWORD/ -- 301 redirects back to no slash version then resolves with 200 ok, to auth/login controller/view
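For anyone who wants to repeat these checks on their own install, here is a rough Python sketch that builds the same four variations for a given route and reports the raw status code without following redirects. The names url_variations and status_of are just made up for illustration, it only handles plain http:// URLs, and somedomain.com is the same placeholder as above.

```python
import http.client
from urllib.parse import urlsplit


def url_variations(base, route, junk="ANY-UNDEFINED-KEYWORD"):
    """Build the four prefix/suffix variations tested above for one route."""
    route = route.strip("/")
    front = f"{base.rstrip('/')}/{junk}/index.php"
    routed = f"{front}/{route}" if route else front
    return [
        routed,                 # junk prefix only
        routed + "/",           # junk prefix plus trailing slash
        f"{routed}/{junk}",     # junk prefix and junk suffix
        f"{routed}/{junk}/",    # junk prefix, junk suffix, trailing slash
    ]


def status_of(url):
    """Return (status code, Location header); http.client never follows redirects."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("GET", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


if __name__ == "__main__":
    # Only prints the URLs; call status_of(url) on each against your own server.
    for url in url_variations("http://somedomain.com", "auth/login"):
        print(url)
```

Anything other than a 404, or a 301 to the one canonical URL, is a duplicate that a crawler could end up indexing.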
I would love some input from other users on how they handle these duplicate content issues in their routes and controllers. Most CMSs are horrible at handling this kind of logic out of the box, and I see lots of people simply resort to injecting canonical tags on all pages to at least keep search engines from indexing all the inappropriate variations. I rarely see anyone solve it elegantly, such that these other variations properly 404 or 301 redirect to the appropriate variation, which I think is what most developers would prefer/intend, as it has a beneficial impact on SEO for the site.
Appreciate any feedback from the community here.
Thanks! -VoxX
After toying around with this a bit, it seems if you nix the index.php out of the requested url via .htaccess RewriteRule, the controllers handle the incoming request much more as I think most would expect them to.
/public/.htaccess
RewriteCond %{THE_REQUEST} /index\.php [NC]
RewriteRule ^(.*?)index\.php(?:\/(.*))?$ /$1$2 [L,R=307,NC,NE]
** R=301 would be more appropriate on a live site; I'm just using 307 for testing.
I realize stripping index.php out of the url may be disadvantageous in certain circumstances (where scripts/libraries rely upon it being in the url for querystring purposes), but this appears to be a pretty simple way to clean up not only the index.php duplicate urls, but also all the undefined prefix / suffix variations that were previously being mistakenly 'hijacked' by inappropriate controllers and resolving instead of 404ing.
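To show what the rule above actually does to a requested path, here is a small Python sketch that reuses the same regex. It only approximates mod_rewrite: redirect_target is a made-up helper name, I'm assuming the .htaccess sits in the document root (so Apache matches the path without its leading slash), and the RewriteCond on THE_REQUEST is not modeled. As far as I can tell that condition is what keeps the redirect from firing again on Laravel's own internal rewrite to index.php.

```python
import re

# Same pattern as the RewriteRule above; the NC flag maps to re.IGNORECASE.
RULE = re.compile(r"^(.*?)index\.php(?:/(.*))?$", re.IGNORECASE)


def redirect_target(request_path):
    """Return the path the rule would redirect to, or None if it does not match."""
    # Assumption: per-directory context, so the leading slash is stripped first.
    match = RULE.match(request_path.lstrip("/"))
    if match is None:
        return None
    prefix = match.group(1)
    suffix = match.group(2) or ""  # an unset group expands to an empty string
    return "/" + prefix + suffix   # mirrors the /$1$2 substitution


if __name__ == "__main__":
    for path in (
        "/index.php/auth/login",
        "/ANY-UNDEFINED-KEYWORD/index.php/docs/5.0",
        "/ANY-UNDEFINED-KEYWORD/index.php",
        "/auth/login",
    ):
        print(path, "->", redirect_target(path))
```

Note that the undefined prefix survives the redirect (the second path becomes /ANY-UNDEFINED-KEYWORD/docs/5.0), but once index.php is gone the request goes through the normal route matching, which is why those variations now 404 instead of resolving.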
How are others tackling this inconsistent handling of routes out of the box in L5?
-VoxX