Table filled in from a list of the top 10M most popular websites.
- List Quality
- This list is ordered by page rank rather than by traffic
- May include inactive or redirected sites
- Does not reflect actual traffic or views
- Other lists that might be interesting:
- https://s3-us-west-1.amazonaws.com/umbrella-static/index.html
- https://www.commoncrawl.org/
- https://tranco-list.eu/
- Not that concerned if this list is exact
- Should provide a good sampling of top sites
- Order really isn't important in this case
- This domain is decades old and not in the list!
- It doesn't get much traffic, but outside the top 10M??
- Good performance test of a database table with 10M rows
- Cold load times take several seconds when paged out
- Well indexed but not performing as well as expected
- Much larger tables have performed much better
- Appears to be doing a full table scan
- This database is running in SQL Server under Docker.
- Will load the same dataset into PostgreSQL for comparison.
- Started crawling the home page for the first 1M domains.
- Interested in stats on use of HTML5, proper HTML, etc.
- Started with the first 100K sites and expanding to first 1M.
- Observations
- Surprising number of domains without a proper HTML lang attribute
- Surprising number of domains not using a proper HTML 5 doctype
- The domains are listed as naked domains, without a fully qualified hostname.
- Many domains don't have a DNS entry for the naked domain.
- domain.com vs www.domain.com; the naked domain should redirect.
- Surprising number of SSL errors on the naked domain
- Clients can't connect without dropping SSL verification
- With standard security checks, clients will never get a redirect
- Domains that redirect behind an invalid SSL cert
- This is super easy to fix/park to handle redirects.
- That is lost revenue, given how much linking it takes to get on this list
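
The doctype and lang checks mentioned in the observations could be sketched roughly as follows. This is only an illustration, not the crawler actually used here: the function name `classify_home_page` is hypothetical, the flag strings are assumed from the Flags column of the table below, and the regex-based parsing is a simplification that a real crawler would probably replace with an HTML parser.

```python
import re

# Hypothetical patterns; the real crawler's logic is not documented here.
DOCTYPE_RE = re.compile(r"<!doctype\s+([^>]*)>", re.IGNORECASE)
HTML_TAG_RE = re.compile(r"<html\b([^>]*)>", re.IGNORECASE)
# The lookbehind skips attributes such as xml:lang or hreflang.
LANG_RE = re.compile(r"""(?<![\w:-])lang\s*=\s*["']?([A-Za-z0-9-]+)""", re.IGNORECASE)


def classify_home_page(html: str) -> list[str]:
    """Return flags in the style of the table's Flags column (assumed names)."""
    flags = []

    # The body of the doctype declaration, lowercased, or "" if there is none.
    match = DOCTYPE_RE.search(html)
    decl = match.group(1).strip().lower() if match else ""
    if decl == "html":
        flags.append("HTML 5")

    # Look for a lang attribute on the <html> element only.
    html_tag = HTML_TAG_RE.search(html)
    lang = LANG_RE.search(html_tag.group(1)) if html_tag else None
    if lang is None:
        flags.append("No Lang")
    elif lang.group(1).lower().split("-")[0] == "en":
        flags.append("English")
    # Any other language gets no language flag (assumption based on the table).

    # Older HTML 4 / XHTML doctypes name their variant in the declaration.
    if "strict" in decl:
        flags.append("Strict")
    elif "transitional" in decl:
        flags.append("Transitional")

    return flags


if __name__ == "__main__":
    page = '<!DOCTYPE html><html lang="en-US"><head></head><body></body></html>'
    print(", ".join(classify_home_page(page)))
```

A page with `<!DOCTYPE html>` but no lang attribute on `<html>` would come back as `HTML 5, No Lang` under these assumptions, which matches the shape of several rows in the table.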
Selected Top Domains
# | Domain | Sort | Rank | Status | Flags |
---|---|---|---|---|---|
19201 | asic.gov.au | 22393 | 4.86 | 200 | HTML 5, English |
19202 | reginfo.gov | 22394 | 4.86 | 200 | English |
19203 | dalkurd.se | 22396 | 4.86 | 200 | HTML 5, English |
19204 | ine.gub.uy | 22398 | 4.86 | 200 | HTML 5 |
19205 | hotels.ng | 22399 | 4.86 | 200 | HTML 5, English |
19206 | hockey-reference.com | 22400 | 4.86 | 200 | HTML 5, English |
19207 | genymotion.com | 22401 | 4.86 | 200 | HTML 5, English |
19208 | rcsed.ac.uk | 22402 | 4.86 | 200 | HTML 5, English |
19209 | opennode.com | 22403 | 4.86 | 200 | HTML 5, No Lang |
19210 | agcocorp.com | 22405 | 4.86 | 200 | HTML 5, English |
19211 | community.tableau.com | 22406 | 4.86 | 200 | HTML 5, English |
19212 | aacap.org | 22407 | 4.86 | 200 | HTML 5, English |
19213 | acsac.org | 22408 | 4.86 | 200 | HTML 5, English |
19214 | baike.sogou.com | 22409 | 4.86 | 200 | HTML 5, No Lang |
19215 | nikon.com | 22411 | 4.86 | 200 | HTML 5, English |
19216 | vaccines.gov | 22412 | 4.86 | 200 | HTML 5, No Lang |
19217 | scribu.net | 22414 | 4.86 | 200 | HTML 5, English |
19218 | sweatco.in | 22415 | 4.86 | 200 | HTML 5, No Lang |
19219 | support.novell.com | 22416 | 4.86 | 200 | No Lang |
19220 | nrega.nic.in | 22417 | 4.86 | 200 | HTML 5, No Lang |
19221 | freebsdfoundation.org | 22418 | 4.86 | 200 | HTML 5, English |
19222 | scaleflex.com | 22420 | 4.86 | 200 | HTML 5, English |
19223 | pugetsystems.com | 22421 | 4.86 | 200 | HTML 5, English |
19224 | spotthestation.nasa.gov | 22422 | 4.86 | 200 | HTML 5, English |
19225 | abril.com.br | 22423 | 4.86 | 200 | HTML 5 |
19226 | brlcad.org | 22424 | 4.86 | 200 | HTML 5, English |
19227 | artofthetitle.com | 22425 | 4.86 | 200 | HTML 5, English |
19228 | amrita.edu | 22426 | 4.86 | 200 | HTML 5, English |
19229 | shopee.ph | 22427 | 4.86 | 200 | HTML 5, No Lang |
19230 | modelviewculture.com | 22428 | 4.86 | 200 | HTML 5, English |
19231 | linux.it | 22429 | 4.86 | 200 | HTML 5 |
19232 | bchydro.com | 22430 | 4.86 | 200 | English, Strict |
19233 | wisconsinhistory.org | 22431 | 4.86 | 200 | HTML 5, English |
19234 | in.news.yahoo.com | 22432 | 4.86 | 200 | HTML 5, English |
19235 | gurtam.com | 22433 | 4.86 | 200 | HTML 5, English |
19236 | movies2.nytimes.com | 22434 | 4.86 | 200 | HTML 5, English |
19237 | toolbarextras.com | 22437 | 4.86 | 200 | HTML 5, English |
19238 | electrifyamerica.com | 22438 | 4.86 | 200 | HTML 5, English |
19239 | forums.oneplus.com | 22439 | 4.86 | 200 | HTML 5, English |
19240 | colourpop.com | 22440 | 4.86 | 200 | HTML 5, English |
19241 | winpcap.org | 22442 | 4.86 | 200 | English, Strict |
19242 | one.org | 22443 | 4.86 | 200 | HTML 5, English |
19243 | jollyrogertelephone.com | 22444 | 4.86 | 200 | HTML 5, English |
19244 | medievalists.net | 22446 | 4.86 | 200 | HTML 5, English |
19245 | donews.com | 22448 | 4.86 | 200 | HTML 5, English |
19246 | codestag.com | 22449 | 4.86 | 200 | HTML 5, English |
19247 | uncpress.org | 22450 | 4.86 | 200 | HTML 5, English |
19248 | cifar.ca | 22451 | 4.86 | 200 | HTML 5, English |
19249 | bgc.bard.edu | 22452 | 4.86 | 200 | HTML 5, No Lang |
19250 | vor.at | 22453 | 4.86 | 200 | HTML 5, No Lang |
19251 | techlog360.com | 22454 | 4.86 | 200 | English |
19252 | apeda.gov.in | 22455 | 4.86 | 200 | HTML 5, English |
19253 | clerk.house.gov | 22456 | 4.86 | 200 | HTML 5, English |
19254 | tabletopia.com | 22457 | 4.86 | 200 | HTML 5, No Lang |
19255 | chateauversailles.fr | 22458 | 4.86 | 200 | HTML 5 |
19256 | registraduria.gov.co | 22459 | 4.86 | 200 | |
19257 | foodsharing.de | 22460 | 4.86 | 200 | HTML 5 |
19258 | community.auth0.com | 22462 | 4.86 | 200 | HTML 5, English |
19259 | coralproject.net | 22463 | 4.86 | 200 | HTML 5, English |
19260 | myopportunity.com | 22464 | 4.86 | 200 | HTML 5, No Lang |
19261 | agilecrm.com | 22465 | 4.86 | 200 | HTML 5, English |
19262 | gp.se | 22466 | 4.86 | 200 | HTML 5 |
19263 | phrases.org.uk | 22467 | 4.86 | 200 | HTML 5, English |
19264 | wpostats.com | 22468 | 4.86 | 200 | HTML 5, English |
19265 | ph.ucla.edu | 22469 | 4.86 | 200 | HTML 5, English |
19266 | dumps.wikimedia.org | 22470 | 4.86 | 200 | English, Transitional |
19267 | pluginkollektiv.org | 22471 | 4.86 | 200 | HTML 5, English |
19268 | vocab.getty.edu | 22472 | 4.86 | 200 | HTML 5, No Lang |
19269 | sandiegounified.org | 22473 | 4.86 | 200 | HTML 5, English |
19270 | sudarmuthu.com | 22474 | 4.86 | 200 | HTML 5, English |
19271 | newsroom.co.nz | 22475 | 4.86 | 200 | HTML 5, English |
19272 | bmcmedicine.biomedcentral.com | 22476 | 4.86 | 200 | HTML 5, English |
19273 | fsb.org | 22477 | 4.86 | 200 | HTML 5, English |
19274 | cafelog.com | 22478 | 4.86 | 200 | HTML 5 |
19275 | bidsquare.com | 22479 | 4.86 | 200 | HTML 5, English |
19276 | ego.gov.tr | 22480 | 4.86 | 200 | HTML 5, No Lang |
19277 | canalextremadura.es | 22481 | 4.86 | 200 | HTML 5 |
19278 | mof.gov.sa | 22482 | 4.86 | 200 | HTML 5 |
19279 | plugins.gradle.org | 22483 | 4.86 | 200 | HTML 5, No Lang |
19280 | sweg.de | 22484 | 4.86 | 200 | HTML 5 |
19281 | publicpower.org | 22486 | 4.86 | 200 | HTML 5, English |
19282 | annevankesteren.nl | 22487 | 4.86 | 200 | HTML 5, No Lang |
19283 | lynk.id | 22488 | 4.86 | 200 | HTML 5, English |
19284 | ninewest.com | 22489 | 4.86 | 200 | HTML 5, English |
19285 | geohack.toolforge.org | 22490 | 4.86 | 200 | No Lang |
19286 | edu.casio.com | 22491 | 4.86 | 200 | HTML 5 |
19287 | ana.net | 22493 | 4.86 | 200 | HTML 5, English |
19288 | emerce.nl | 22494 | 4.86 | 200 | HTML 5 |
19289 | sr.wordpress.org | 22495 | 4.86 | 200 | HTML 5 |
19290 | shanghaiist.com | 22496 | 4.86 | 200 | HTML 5, English |
19291 | salk.edu | 22498 | 4.86 | 200 | HTML 5, English |
19292 | ajaydsouza.com | 22499 | 4.86 | 200 | HTML 5, English |
19293 | quandl.com | 22500 | 4.86 | 200 | HTML 5, English |
19294 | reneweconomy.com.au | 22501 | 4.86 | 200 | HTML 5, English |
19295 | rock-im-park.com | 22503 | 4.86 | 200 | HTML 5 |
19296 | lambda.gsfc.nasa.gov | 22505 | 4.86 | 200 | No Lang, Transitional |
19297 | apollo.io | 22506 | 4.86 | 200 | HTML 5, English |
19298 | msu.ru | 22507 | 4.86 | 200 | HTML 5 |
19299 | newspaperarchive.com | 22508 | 4.86 | 200 | HTML 5, English |
19300 | ip-paris.fr | 22509 | 4.86 | 200 | HTML 5 |
Data from: Open PageRank