Table populated from a list of the top 10M most popular websites.
- List Quality
- This list is ordered by page rank rather than traffic
- May include inactive or redirected sites
- Does not reflect actual traffic or views
- Other lists that might be interesting:
- https://s3-us-west-1.amazonaws.com/umbrella-static/index.html
- https://www.commoncrawl.org/
- https://tranco-list.eu/
- Not too concerned whether this list is exact
- Should provide a good sampling of top sites
- Order really isn't important in this case
- This domain is decades old and not in the list!
- Doesn't get much traffic, but ranked beyond 10M??
- Good performance test of a database table with 10M rows
- Cold load times take several seconds when paged out
- Well indexed but not performing as well as expected
- Much larger tables have performed much better
- Appears to be doing a full table scan
- This database is running in SQL Server under Docker.
- Will load the same dataset into Postgres for comparison.
- Started crawling the home page for the first 1M domains.
- Interested in stats on use of HTML5, valid HTML, etc.
- Started with the first 100K sites and expanding to the first 1M.
- Observations
- Surprising number of domains without a proper HTML lang attribute
- Surprising number of domains not using a proper HTML 5 doctype
- The domains are listed naked, without a fully qualified hostname.
- Many domains don't have a DNS entry for the naked domain.
- domain.com vs www.domain.com, where the naked domain should redirect.
- Surprising number of SSL errors on the naked domain
- Clients can't connect without dropping SSL verification
- With standard security checks, clients will never get a redirect
- Domains that redirect behind an invalid SSL certificate
- This is super easy to fix, or park, to handle redirects.
- Lost revenue, given how much linking it takes to get on this list
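
The doctype and lang observations above can be checked with a small classifier run against each fetched home page. This is only a minimal sketch, not the crawler used here: the function names (`classify_doctype`, `classify_lang`, `page_flags`), the regular expressions, and the mapping onto labels such as "HTML 5", "Transitional", "English" and "No Lang" are assumptions inferred from the Flags column in the table below.

```python
import re
from typing import List

# First <!DOCTYPE ...> declaration, matched case-insensitively.
DOCTYPE_RE = re.compile(r"<!doctype\s+([^>]*)>", re.IGNORECASE)
# A plain lang attribute on the opening <html> tag. The leading \s keeps
# xml:lang from being counted as lang.
LANG_RE = re.compile(
    r"<html\b[^>]*?\slang\s*=\s*[\"']?([A-Za-z-]+)", re.IGNORECASE
)


def classify_doctype(html: str) -> str:
    """Return a doctype label, or '' when no doctype is declared."""
    match = DOCTYPE_RE.search(html[:2048])
    if match is None:
        return ""
    decl = match.group(1).strip().lower()
    if decl == "html":
        return "HTML 5"
    if "transitional" in decl:
        return "Transitional"
    if "strict" in decl:
        return "Strict"
    return "Other"  # assumption: anything else is lumped together


def classify_lang(html: str) -> str:
    """Return 'No Lang', 'English', or '' for any other declared language."""
    match = LANG_RE.search(html[:4096])
    if match is None:
        return "No Lang"  # missing or empty lang attribute
    tag = match.group(1).lower()
    return "English" if tag == "en" or tag.startswith("en-") else ""


def page_flags(html: str) -> List[str]:
    """Combine the non-empty doctype and language labels for one page."""
    return [flag for flag in (classify_doctype(html), classify_lang(html)) if flag]


print(page_flags('<!DOCTYPE html><html lang="en-US"><head>'))
```

Regular expressions are a shortcut here; a real HTML parser would be more robust against comments or unusual markup before the `<html>` tag.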
Selected Top Domains
# | Domain | Sort | Rank | Status | Flags |
---|---|---|---|---|---|
17401 | thecreativepenn.com | 20288 | 4.88 | 200 | HTML 5, English |
17402 | ximalaya.com | 20289 | 4.88 | 200 | HTML 5, No Lang |
17403 | gtrusted.com | 20290 | 4.88 | 200 | HTML 5, English |
17404 | catalog.update.microsoft.com | 20291 | 4.88 | 200 | English, Transitional |
17405 | airwallex.com | 20292 | 4.88 | 200 | HTML 5, English |
17406 | rollingstone.fr | 20293 | 4.88 | 200 | HTML 5 |
17407 | clipstudio.net | 20294 | 4.88 | 200 | HTML 5 |
17408 | fcrr.org | 20295 | 4.88 | 200 | HTML 5, English |
17409 | cvc.cervantes.es | 20296 | 4.88 | 200 | Strict |
17410 | timetopet.com | 20297 | 4.88 | 200 | HTML 5, English |
17411 | webvr.rocks | 20298 | 4.88 | 200 | HTML 5, English |
17412 | idsociety.org | 20299 | 4.88 | 200 | HTML 5, English |
17413 | hbf.com.au | 20300 | 4.88 | 200 | HTML 5, English |
17414 | thechronicleherald.ca | 20301 | 4.88 | 200 | HTML 5, No Lang |
17415 | dmi.dk | 20302 | 4.88 | 200 | HTML 5 |
17416 | c82.net | 20304 | 4.88 | 200 | HTML 5, No Lang |
17417 | twinery.org | 20305 | 4.88 | 200 | HTML 5, English |
17418 | iono.fm | 20306 | 4.88 | 200 | HTML 5, English |
17419 | nauticalcharts.noaa.gov | 20308 | 4.88 | 200 | English, Strict |
17420 | cse-cst.gc.ca | 20309 | 4.88 | 200 | HTML 5, English |
17421 | cms.megaphone.fm | 20311 | 4.88 | 200 | HTML 5, No Lang |
17422 | discoverlancaster.com | 20312 | 4.88 | 200 | HTML 5, English |
17423 | dailymed.nlm.nih.gov | 20314 | 4.88 | 200 | HTML 5, English |
17424 | crackerbarrel.com | 20315 | 4.88 | 200 | HTML 5, English |
17425 | puzzazz.com | 20317 | 4.88 | 200 | No Lang, Transitional |
17426 | quatrorodas.abril.com.br | 20318 | 4.88 | 200 | HTML 5 |
17427 | eschoolnews.com | 20319 | 4.88 | 200 | HTML 5, English |
17428 | gallery.yopriceville.com | 20320 | 4.88 | 200 | HTML 5, English |
17429 | mothership.sg | 20321 | 4.88 | 200 | HTML 5, No Lang |
17430 | plutobooks.com | 20322 | 4.88 | 200 | HTML 5, English |
17431 | cmo.com | 20323 | 4.88 | 200 | HTML 5, No Lang |
17432 | ayrshare.com | 20324 | 4.88 | 200 | HTML 5, English |
17433 | pulsar-nv.com | 20325 | 4.88 | 200 | HTML 5, English |
17434 | civic.mit.edu | 20326 | 4.88 | 200 | HTML 5, English |
17435 | gresham.ac.uk | 20328 | 4.88 | 200 | HTML 5, English |
17436 | pjmedia.com | 20329 | 4.88 | 200 | HTML 5, English |
17437 | open-ui.org | 20330 | 4.88 | 200 | HTML 5, English |
17438 | livesoccertv.com | 20331 | 4.88 | 200 | HTML 5, English |
17439 | charismaticplanet.com | 20332 | 4.88 | 200 | English |
17440 | kids.britannica.com | 20334 | 4.88 | 200 | English |
17441 | ecosystema.ru | 20335 | 4.88 | 200 | No Lang |
17442 | d.tube | 20336 | 4.88 | 200 | HTML 5, No Lang |
17443 | themes.shopify.com | 20337 | 4.88 | 200 | HTML 5, English |
17444 | day.js.org | 20338 | 4.88 | 200 | HTML 5, English |
17445 | uk.movies.yahoo.com | 20340 | 4.88 | 200 | HTML 5, No Lang |
17446 | radioandmusic.com | 20341 | 4.88 | 200 | English |
17447 | criminalip.io | 20343 | 4.88 | 200 | HTML 5, English |
17448 | bugguide.net | 20344 | 4.88 | 200 | English, Transitional |
17449 | taenk.dk | 20345 | 4.88 | 200 | HTML 5 |
17450 | skyhorsepublishing.com | 20346 | 4.88 | 200 | HTML 5, English |
17451 | midea.com | 20347 | 4.88 | 200 | HTML 5, English |
17452 | newsroom.heart.org | 20348 | 4.88 | 200 | HTML 5, English |
17453 | guidebook.com | 20349 | 4.88 | 200 | HTML 5, English |
17454 | cos.io | 20350 | 4.88 | 200 | HTML 5, English |
17455 | kount.com | 20351 | 4.88 | 200 | HTML 5, English |
17456 | stadt-bremerhaven.de | 20352 | 4.88 | 200 | HTML 5 |
17457 | correctiv.org | 20353 | 4.88 | 200 | HTML 5 |
17458 | music.amazon.co.uk | 20354 | 4.88 | 200 | HTML 5, English |
17459 | biobiochile.cl | 20355 | 4.88 | 200 | HTML 5 |
17460 | voegol.com.br | 20356 | 4.88 | 200 | HTML 5 |
17461 | 9-11commission.gov | 20357 | 4.88 | 200 | English, Transitional |
17462 | nashvillescene.com | 20358 | 4.88 | 200 | HTML 5, English |
17463 | help.disneyplus.com | 20360 | 4.88 | 200 | HTML 5, English |
17464 | cntraveller.com | 20362 | 4.88 | 200 | HTML 5, English |
17465 | marketingmag.ca | 20363 | 4.88 | 200 | No Lang, Transitional |
17466 | uv.mx | 20364 | 4.88 | 200 | HTML 5 |
17467 | whisk.com | 20365 | 4.88 | 200 | HTML 5, English |
17468 | inven.co.kr | 20366 | 4.88 | 200 | HTML 5 |
17469 | thehealthcareblog.com | 20367 | 4.88 | 200 | HTML 5, English |
17470 | github.co | 20368 | 4.88 | 200 | HTML 5, English |
17471 | condenaststore.com | 20369 | 4.88 | 200 | HTML 5, English |
17472 | wqad.com | 20370 | 4.88 | 200 | HTML 5, English |
17473 | tinkerlab.com | 20371 | 4.88 | 200 | HTML 5, English |
17474 | plugins.trac.wordpress.org | 20373 | 4.88 | 200 | No Lang, Strict |
17475 | laverne.edu | 20375 | 4.88 | 200 | HTML 5, English |
17476 | exim.org | 20376 | 4.88 | 200 | English |
17477 | meduniwien.ac.at | 20380 | 4.88 | 200 | HTML 5 |
17478 | stevens.edu | 20381 | 4.88 | 200 | HTML 5, English |
17479 | 12factor.net | 20382 | 4.88 | 200 | HTML 5, English |
17480 | app.websitepolicies.com | 20383 | 4.88 | 200 | HTML 5, English |
17481 | liverpoolfc.com | 20384 | 4.88 | 200 | HTML 5, English |
17482 | universaldependencies.org | 20385 | 4.88 | 200 | English, Transitional |
17483 | canaltech.com.br | 20387 | 4.88 | 200 | HTML 5 |
17484 | biztechmagazine.com | 20388 | 4.88 | 200 | HTML 5, English |
17485 | appian.com | 20389 | 4.88 | 200 | HTML 5, English |
17486 | irinnews.org | 20390 | 4.88 | 200 | HTML 5, English |
17487 | kgmservizi.com | 20391 | 4.88 | 200 | HTML 5 |
17488 | pe.com | 20392 | 4.88 | 200 | HTML 5, English |
17489 | learningforjustice.org | 20393 | 4.88 | 200 | HTML 5, English |
17490 | drdemento.com | 20395 | 4.88 | 200 | No Lang |
17491 | wpeverest.com | 20396 | 4.88 | 200 | HTML 5, English |
17492 | linux-apps.com | 20397 | 4.88 | 200 | HTML 5, English |
17493 | heathceramics.com | 20398 | 4.88 | 200 | HTML 5, English |
17494 | brightid.org | 20399 | 4.88 | 200 | HTML 5, English |
17495 | nordot.app | 20401 | 4.88 | 200 | HTML 5, No Lang |
17496 | skitch.com | 20402 | 4.88 | 200 | HTML 5, English |
17497 | privacy.thewaltdisneycompany.com | 20403 | 4.88 | 200 | HTML 5, English |
17498 | ie.edu | 20404 | 4.88 | 200 | HTML 5, English |
17499 | jewishjournal.com | 20405 | 4.88 | 200 | HTML 5, English |
17500 | nporadio1.nl | 20406 | 4.88 | 200 | HTML 5 |
Data from: Open PageRank
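
For rough stats on HTML 5 and lang usage, the Flags column of a table like the one above can be tallied directly. The following is a sketch that assumes the six-column, pipe-delimited layout shown here; `parse_row` and `tally_flags` are hypothetical helper names, and the sample rows are copied from the table.

```python
from collections import Counter
from typing import Iterable, List, Optional


def parse_row(line: str) -> Optional[List[str]]:
    """Split one pipe-delimited row into cells; return None for non-data rows."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    # Data rows have six cells and start with a numeric row index, which
    # filters out the header and the ---|--- separator line.
    if len(cells) != 6 or not cells[0].isdigit():
        return None
    return cells


def tally_flags(lines: Iterable[str]) -> Counter:
    """Count how often each flag appears across all data rows."""
    counts: Counter = Counter()
    for line in lines:
        cells = parse_row(line)
        if cells is None:
            continue
        # The last column holds a comma-separated list of flags.
        counts.update(f.strip() for f in cells[5].split(",") if f.strip())
    return counts


sample = [
    "# | Domain | Sort | Rank | Status | Flags |",
    "---|---|---|---|---|---|",
    "17401 | thecreativepenn.com | 20288 | 4.88 | 200 | HTML 5, English |",
    "17402 | ximalaya.com | 20289 | 4.88 | 200 | HTML 5, No Lang |",
    "17404 | catalog.update.microsoft.com | 20291 | 4.88 | 200 | English, Transitional |",
    "17406 | rollingstone.fr | 20293 | 4.88 | 200 | HTML 5 |",
]

for flag, count in tally_flags(sample).most_common():
    print(f"{flag}: {count}")
```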