Table filled in from the top 10M most popular websites.
- List Quality
- This list is ordered by page rank rather than traffic
- May include inactive or redirected sites
- Does not reflect actual traffic or views
- Other lists that might be interesting:
- https://s3-us-west-1.amazonaws.com/umbrella-static/index.html
- https://www.commoncrawl.org/
- https://tranco-list.eu/
- Not that concerned if this list is exact
- Should provide good sampling of top sites
- Order really isn't important in this case
- This domain is decades old and not in the list!
- It doesn't get much traffic, but not even in the top 10M?
- Good performance test of a database table with 10M rows
- Cold load times take several seconds when paged out
- Well indexed but not performing as well as expected
- Much larger tables have performed much better
- Appears to be doing a full table scan
- This database is running in SQL Server under Docker.
- Will load the same dataset into PostgreSQL for comparison.
- Started crawling the home page for the first 1M domains.
- Interested in stats on use of HTML5, valid HTML, etc.
- Started with the first 100K sites and expanding to first 1M.
- Observations
- Surprising number of domains without a proper html lang attribute
- Surprising number of domains not using a proper HTML 5 doctype
- The domains are listed as naked domains, without a fully qualified hostname.
- Many domains don't have a DNS entry for the naked domain.
- domain.com vs www.domain.com: one should redirect to the other.
- Surprising number of SSL errors on the naked domain
- Clients can't connect without dropping SSL verification
- With standard security checks clients will never get a redirect
- Domains that redirect behind an invalid SSL cert
- This is super easy to fix: park the naked domain to handle redirects.
- Lost revenue, given how much linking it takes to get on this list
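
The naked-domain observations above can be reproduced with a small probe. This is only a rough sketch, not the crawler used for these notes: the function names (`candidate_hosts`, `resolves`, `tls_verifies`, `check_domain`) are made up for illustration, and it assumes that "proper" means the host resolves in DNS and completes a TLS handshake on port 443 under Python's default certificate checks.

```python
import socket
import ssl
from typing import Callable, Dict, List


def candidate_hosts(domain: str) -> List[str]:
    """Return the naked domain plus its www variant (hypothetical helper)."""
    domain = domain.strip().lower().rstrip(".")
    if domain.startswith("www."):
        return [domain]
    return [domain, "www." + domain]


def resolves(host: str) -> bool:
    """True if the system resolver returns at least one address for host."""
    try:
        return len(socket.getaddrinfo(host, 443)) > 0
    except (socket.gaierror, UnicodeError):
        return False


def tls_verifies(host: str, timeout: float = 5.0) -> bool:
    """True if a TLS handshake on port 443 passes the default certificate checks."""
    context = ssl.create_default_context()  # verifies certificate chain and hostname
    try:
        with socket.create_connection((host, 443), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:  # covers ssl.SSLError, timeouts, and refused connections
        return False


def check_domain(
    domain: str, resolver: Callable[[str], bool] = resolves
) -> Dict[str, bool]:
    """Map each candidate host to whether it passes the given check."""
    return {host: resolver(host) for host in candidate_hosts(domain)}
```

Calling `check_domain("example.com")` reports DNS resolution for both hosts, and `check_domain("example.com", resolver=tls_verifies)` reports whether a client with verification left on could even reach a redirect. A naked domain that fails while the www host passes matches the pattern described above.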
Selected Top Domains
# | Domain | Sort | Rank | Status | Flags |
---|---|---|---|---|---|
18401 | mit.zoom.us | 21446 | 4.87 | 200 | HTML 5, English |
18402 | mat.univie.ac.at | 21447 | 4.87 | 200 | No Lang |
18403 | wma.net | 21448 | 4.87 | 200 | HTML 5, English |
18404 | astro.com | 21449 | 4.87 | 200 | HTML 5, English |
18405 | whimsical.com | 21450 | 4.87 | 200 | HTML 5, English |
18406 | hola.org | 21451 | 4.87 | 200 | HTML 5, English |
18407 | fietsersbond.nl | 21452 | 4.87 | 200 | HTML 5 |
18408 | account.authorize.net | 21453 | 4.87 | 200 | No Lang |
18409 | ids.ac.uk | 21454 | 4.87 | 200 | HTML 5, English |
18410 | casavogue.globo.com | 21455 | 4.87 | 200 | HTML 5 |
18411 | bugs.sun.com | 21456 | 4.87 | 200 | No Lang |
18412 | true.th | 21458 | 4.87 | 200 | HTML 5 |
18413 | scholarship.law.duke.edu | 21459 | 4.87 | 200 | HTML 5, English |
18414 | teslauniverse.com | 21460 | 4.87 | 200 | HTML 5, English |
18415 | ling.auf.net | 21461 | 4.87 | 200 | English, Transitional |
18416 | login.siteground.com | 21462 | 4.87 | 200 | HTML 5, No Lang |
18417 | digitalcommons.law.yale.edu | 21463 | 4.87 | 200 | HTML 5, English |
18418 | cheatsheet.com | 21464 | 4.87 | 200 | HTML 5, English |
18419 | picandocodigo.net | 21466 | 4.87 | 200 | HTML 5 |
18420 | libres.uncg.edu | 21467 | 4.87 | 200 | No Lang, Transitional |
18421 | za.linkedin.com | 21468 | 4.87 | 200 | HTML 5, English |
18422 | fashion.telegraph.co.uk | 21469 | 4.87 | 200 | HTML 5, English |
18423 | news.arizona.edu | 21471 | 4.87 | 200 | HTML 5, English |
18424 | hotspotshield.com | 21472 | 4.87 | 200 | HTML 5, English |
18425 | digital.library.upenn.edu | 21473 | 4.87 | 200 | English, Transitional |
18426 | mintaka.sdsu.edu | 21475 | 4.87 | 200 | HTML 5, English |
18427 | aminer.org | 21476 | 4.87 | 200 | English |
18428 | webwire.com | 21478 | 4.87 | 200 | No Lang, Transitional |
18429 | gympass.com | 21479 | 4.87 | 200 | HTML 5, English |
18430 | kctv5.com | 21480 | 4.87 | 200 | HTML 5, English |
18431 | wkdq.com | 21481 | 4.87 | 200 | HTML 5, English |
18432 | lineicons.com | 21482 | 4.87 | 200 | HTML 5, English |
18433 | embryo.asu.edu | 21483 | 4.87 | 200 | HTML 5, English |
18434 | donki.com | 21484 | 4.87 | 200 | HTML 5 |
18435 | metroweekly.com | 21485 | 4.87 | 200 | HTML 5, English |
18436 | sonyliv.com | 21486 | 4.87 | 200 | HTML 5, English |
18437 | mansfieldnewsjournal.com | 21487 | 4.87 | 200 | HTML 5, English |
18438 | censys.io | 21489 | 4.87 | 200 | HTML 5, English |
18439 | palagems.com | 21490 | 4.87 | 200 | HTML 5, English |
18440 | micropub.net | 21491 | 4.87 | 200 | HTML 5, No Lang |
18441 | homepage.univie.ac.at | 21492 | 4.87 | 200 | HTML 5 |
18442 | globalwordnet.org | 21494 | 4.87 | 200 | HTML 5, English |
18443 | bizible.com | 21495 | 4.87 | 200 | HTML 5, No Lang |
18444 | news.163.com | 21496 | 4.87 | 200 | HTML 5, No Lang |
18445 | uwindsor.ca | 21497 | 4.87 | 200 | English |
18446 | textrequest.com | 21498 | 4.87 | 200 | HTML 5, English |
18447 | stuttgarter-zeitung.de | 21499 | 4.87 | 200 | HTML 5 |
18448 | vs.inf.ethz.ch | 21500 | 4.87 | 200 | No Lang, Transitional |
18449 | wcities.com | 21501 | 4.87 | 200 | HTML 5, English |
18450 | perthnow.com.au | 21502 | 4.87 | 200 | HTML 5, English |
18451 | tecartabible.com | 21503 | 4.87 | 200 | HTML 5, No Lang |
18452 | tribecafilm.com | 21504 | 4.87 | 200 | HTML 5, No Lang |
18453 | insider.office.com | 21505 | 4.87 | 200 | HTML 5, English |
18454 | life.pravda.com.ua | 21506 | 4.87 | 200 | HTML 5 |
18455 | inboxdollars.com | 21508 | 4.87 | 200 | HTML 5, English |
18456 | globalfundforwomen.org | 21509 | 4.87 | 200 | HTML 5, English |
18457 | zoe.com | 21510 | 4.87 | 200 | HTML 5, English |
18458 | becomingminimalist.com | 21511 | 4.87 | 200 | HTML 5, English |
18459 | iso639-3.sil.org | 21512 | 4.87 | 200 | English |
18460 | drugpolicy.org | 21513 | 4.87 | 200 | HTML 5, English |
18461 | guava.dev | 21514 | 4.87 | 200 | HTML 5, No Lang |
18462 | lauritz.com | 21515 | 4.87 | 200 | HTML 5 |
18463 | 24-7pressrelease.com | 21516 | 4.87 | 200 | HTML 5, English |
18464 | tuttonapoli.net | 21517 | 4.87 | 200 | HTML 5 |
18465 | briansolis.com | 21518 | 4.87 | 200 | HTML 5, English |
18466 | genealogieonline.nl | 21519 | 4.87 | 200 | HTML 5 |
18467 | pythonbytes.fm | 21520 | 4.87 | 200 | HTML 5, English |
18468 | ias.edu | 21521 | 4.87 | 200 | HTML 5, English |
18469 | herald.co.zw | 21522 | 4.87 | 200 | HTML 5, English |
18470 | divessi.com | 21523 | 4.87 | 200 | HTML 5, English |
18471 | kortrijk.be | 21524 | 4.87 | 200 | HTML 5 |
18472 | lsvp.com | 21526 | 4.87 | 200 | HTML 5, English |
18473 | internet2.edu | 21528 | 4.87 | 200 | HTML 5, English |
18474 | seu-e.cat | 21529 | 4.87 | 200 | HTML 5, No Lang |
18475 | onmsft.com | 21530 | 4.87 | 200 | HTML 5, English |
18476 | c-spanvideo.org | 21531 | 4.87 | 200 | HTML 5, English |
18477 | tudn.com | 21532 | 4.87 | 200 | HTML 5 |
18478 | slf4j.org | 21533 | 4.87 | 200 | HTML 5, English |
18479 | rethinkingschools.org | 21534 | 4.87 | 200 | HTML 5, English |
18480 | tv-tokyo.co.jp | 21535 | 4.87 | 200 | HTML 5 |
18481 | mobilepay.fi | 21536 | 4.87 | 200 | HTML 5 |
18482 | cockroachlabs.com | 21537 | 4.87 | 200 | HTML 5, English |
18483 | vikatan.com | 21538 | 4.87 | 200 | HTML 5 |
18484 | ignitiondeck.com | 21539 | 4.87 | 200 | HTML 5, English |
18485 | smartpakequine.com | 21540 | 4.87 | 200 | HTML 5, English |
18486 | vgchartz.com | 21541 | 4.87 | 200 | HTML 5, English |
18487 | greifswald.de | 21542 | 4.87 | 200 | HTML 5 |
18488 | esolangs.org | 21543 | 4.87 | 200 | HTML 5, English |
18489 | intelligencesquared.com | 21544 | 4.87 | 200 | HTML 5, English |
18490 | nndb.com | 21545 | 4.87 | 200 | No Lang |
18491 | xsens.com | 21546 | 4.87 | 200 | HTML 5, No Lang |
18492 | creema.jp | 21548 | 4.87 | 200 | HTML 5 |
18493 | aber.ac.uk | 21549 | 4.87 | 200 | HTML 5, English |
18494 | fsymbols.com | 21550 | 4.87 | 200 | HTML 5, English |
18495 | robots.ox.ac.uk | 21551 | 4.87 | 200 | English, Strict |
18496 | qualifications.pearson.com | 21552 | 4.87 | 200 | HTML 5, English |
18497 | clios.com | 21553 | 4.87 | 200 | HTML 5, English |
18498 | philpeople.org | 21554 | 4.87 | 200 | HTML 5, English |
18499 | federalnewsradio.com | 21555 | 4.87 | 200 | HTML 5, English |
18500 | khmertimeskh.com | 21556 | 4.87 | 200 | HTML 5, English |
Data from: Open PageRank
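
For reference, flags like the ones in the table could be derived from a fetched home page roughly as follows. This is a guess at the rules, not the actual crawler logic: it assumes "HTML 5" means a bare `<!DOCTYPE html>`, "Transitional" and "Strict" come from the legacy doctype identifier, "English" means the `lang` attribute on `<html>` starts with `en`, and "No Lang" means the attribute is missing. The `classify` function and the regexes are illustrative names only.

```python
import re
from typing import List, Optional

DOCTYPE_RE = re.compile(r"<!doctype\s+([^>]*)>", re.IGNORECASE)
HTML_TAG_RE = re.compile(r"<html\b([^>]*)>", re.IGNORECASE)
# The lookbehind skips xml:lang and data-lang so only a plain lang attribute matches.
LANG_RE = re.compile(r"""(?<![\w:-])lang\s*=\s*["']?([A-Za-z0-9-]+)""", re.IGNORECASE)


def classify(html: str) -> List[str]:
    """Return table-style flags for a page (assumed rules, see the text above)."""
    head = html[:4096]  # doctype and <html> tag sit at the top of the document

    doctype_flag: Optional[str] = None
    doctype = DOCTYPE_RE.search(head)
    if doctype:
        decl = doctype.group(1).strip().lower()
        if decl == "html":
            doctype_flag = "HTML 5"
        elif "transitional" in decl:
            doctype_flag = "Transitional"
        elif "strict" in decl:
            doctype_flag = "Strict"

    lang_flag: Optional[str] = None
    tag = HTML_TAG_RE.search(head)
    lang = LANG_RE.search(tag.group(1)) if tag else None
    if lang is None:
        lang_flag = "No Lang"
    elif lang.group(1).lower().split("-")[0] == "en":
        lang_flag = "English"
    # Any other language gets no flag, as in the rows that show only "HTML 5".

    # Mirror the column order used in the table: "HTML 5" first, legacy doctypes last.
    if doctype_flag == "HTML 5":
        ordered = [doctype_flag, lang_flag]
    else:
        ordered = [lang_flag, doctype_flag]
    return [flag for flag in ordered if flag]
```

A regex pass over the first few kilobytes is crude but cheap at 1M pages; a real HTML parser would handle comments and odd quoting more reliably.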