labunix's blog

labunix's Lab Unix

Fetching the HTTP status codes for the "List of Malicious Overseas Websites"

■ Fetching the HTTP status codes for the "List of Malicious Overseas Websites"
 This is a follow-up to the post below.

 Checking whether the names in the "List of Malicious Overseas Websites" resolve.
 http://labunix.hateblo.jp/entry/20141128/1417100814

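■ Incidentally, the earlier one-liner failed to pick up two kinds of URL,
 and it was also picking up "go.jp" hosts.
 The pipeline here adds "ony" to the pdfgrep pattern and filters out go.jp;
 its final grep shows the two problem entries (one matched via "ony", one with a comma in place of a dot).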

$ pdfgrep "ony|http" "141031adjustments_1.pdf" | \
  sed s%".*\(http://\)"%"\n\1"%g | awk -F/ '/http/ {print $3}' | \
  grep -v "go.jp\|^\$" | nl -w 3 -n rz | grep ",\|ony"
186	www.onyxalliance.org
198	www,gucchiya.com
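
The host extraction can also be tried on its own with sample text. The following is only a minimal sketch, not the pipeline used above: the function name extract_hosts is made up for illustration, it assumes a grep that supports -o (GNU and BSD grep do), and it does not cover the "ony" case.

```shell
#!/bin/sh
# extract_hosts (hypothetical helper): read text on stdin and print the host
# part of every http:// URL, one per line.
extract_hosts() {
  grep -o 'http://[^/[:space:]]*' |  # each URL up to the first "/" or blank
    sed 's%^http://%%' |             # strip the scheme
    tr ',' '.' |                     # repair "www,example.com" style typos
    grep -v '\.go\.jp$' |            # drop go.jp hosts
    grep -v '^$'                     # drop empty results
}

# Example:
#   printf 'see http://www,example.com/shop\n' | extract_hosts
```

Feeding it the output of pdfgrep instead of printf should give a host list comparable to the one above, minus any entries that do not contain "http://" on the same line.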

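■ Note that some URLs appear more than once, but I will let that pass.
 It does make me wonder whether the Consumer Affairs Agency is vetting the list properly...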

$ pdfgrep "ony|http" "141031adjustments_1.pdf" | \
  sed s%".*\(http://\)"%"\n\1"%g | awk -F/ '/http/ {print $3}' | \
  grep -v "go.jp\|^\$" | sort | uniq -c | awk '($1>1){print}'
      2 www.brandedcorner.com
      2 www.jpzizi.com
      2 www.renewingfaces.com
      2 www.urbntouch.com
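
The duplicates are left in for the rest of this post. If they were to be dropped instead, one possible approach is sketched below; the function names are hypothetical, and the only assumption is a POSIX awk, sort and uniq.

```shell
#!/bin/sh
# dedupe_keep_order (hypothetical helper): print each input line only the
# first time it appears, keeping the original order.
dedupe_keep_order() {
  awk '!seen[$0]++'
}

# list_duplicates (hypothetical helper): print each line that occurs more
# than once, sorted, one copy per duplicated line.
list_duplicates() {
  sort | uniq -d
}

# Example:
#   printf 'a\nb\na\n' | dedupe_keep_order
```

Placing dedupe_keep_order before nl would renumber the list, so the numbers would no longer line up with the results shown in this post.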

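■ Run it as follows, then just wait.
 Even though the requests go through a proxy, a timeout cannot be specified for w3m, so it takes a long time...
 I suspect most of the waiting is on DNS timeouts.
 I am curious where the "302 Moved Temporarily" responses redirect to.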

$ pdfgrep "ony|http" "141031adjustments_1.pdf" | \
  sed s%".*\(http://\)"%"\n\1"%g | awk -F/ '/http/ {print $3}' | \
  grep -v "go.jp\|^\$" | tr ',' '.' | nl -n rz -w 3 | \
  while read -r num host; do
    # Print the number and host as the first two CSV fields, without a newline.
    printf '"%s","%s",' "$num" "$host"
    # Fetch only the response headers and keep the status line as the third field.
    w3m -dump_head "$host" < /dev/null 2> /dev/null | awk '/^HTTP\// {print "\""$0"\""}'
  done

"001","www.guccimenjpsale.com","HTTP/1.0 200 OK"
"002","www.bestmonclerjp.com","HTTP/1.0 200 OK"
"003","www.nikeja.com","HTTP/1.0 403 Forbidden"
"004","www.hanbainihon.com","HTTP/1.0 504 Gateway Time-out"
"005","www.lucas-lvbag.com","HTTP/1.0 504 Gateway Time-out"
"006","www.coachgardens.com","HTTP/1.0 504 Gateway Time-out"
"007","gutebaby.com","HTTP/1.0 200 OK"
"008","www.sneakersjapanu.com","HTTP/1.0 200 OK"
"009","www.coachhot2013.com","HTTP/1.0 504 Gateway Time-out"
"010","www.nikesjpz.com","HTTP/1.0 200 OK"
"011","www.nikeadidasjpz.com","HTTP/1.0 504 Gateway Time-out"
"012","aovins.com","HTTP/1.0 200 OK"
"013","www.nikeonlinejapan.com","HTTP/1.0 504 Gateway Time-out"
"014","www.coachya2013.com","HTTP/1.0 504 Gateway Time-out"
"015","www.rakutenku.com","HTTP/1.0 200 OK"
"016","www.chibauni.com","HTTP/1.0 200 OK"
"017","styleja.com","HTTP/1.0 200 OK"
"018","jennus.com","HTTP/1.0 200 OK"
"019","www.saifudesigner.com","HTTP/1.0 200 OK"
"020","www.becausetheyknow.com","HTTP/1.0 504 Gateway Time-out"
"021","www.pandplimo.com","HTTP/1.0 504 Gateway Time-out"
"022","i-you-i.com","HTTP/1.0 504 Gateway Time-out"
"023","www.vanlai.com","HTTP/1.0 504 Gateway Time-out"
"024","kikuku.net","HTTP/1.0 504 Gateway Time-out"
"025","www.ugg-kakaku.com","HTTP/1.0 403 Forbidden"
"026","www.toribachija.com","HTTP/1.0 504 Gateway Time-out"
"027","kutu.fenyastravels.com","HTTP/1.0 200 OK"
"028","www.tuffymaps.com","HTTP/1.0 504 Gateway Time-out"
"029","www.diogene99.com","HTTP/1.0 200 OK"
"030","www.バーバリーアウトレット.com","HTTP/1.0 400 Bad Request"
"031","www.blgoddard.com","HTTP/1.0 504 Gateway Time-out"
"032","www.2013guccistorejp.com","HTTP/1.0 200 OK"
"033","www.shinesneaker.com","HTTP/1.0 200 OK"
"034","www.brepli111.com","HTTP/1.0 504 Gateway Time-out"
"035","www.smetn.com","HTTP/1.0 200 OK"
"036","www.m1700.com","HTTP/1.0 403 Forbidden"
"037","www.uggboots.jp","HTTP/1.0 200 OK"
"038","nutrired.org","HTTP/1.0 200 OK"
"039","abassjp.com","HTTP/1.0 200 OK"
"040","www.enlvs.com","HTTP/1.0 200 OK"
"041","www.top-kopi.net","HTTP/1.0 200 OK"
"042","emails.brandheya.com","HTTP/1.0 504 Gateway Time-out"
"043","brandheya.com","HTTP/1.0 200 OK"
"044","brandheyajp.com","HTTP/1.0 200 OK"
"045","www.brandheyajp.com","HTTP/1.0 200 OK"
"046","www.ktokopi.com","HTTP/1.0 504 Gateway Time-out"
"047","louis-360.com","HTTP/1.0 500 Internal Server Error"
"048","www.burberrybluelabeljpsale.com","HTTP/1.0 504 Gateway Time-out"
"049","www.celine-japan.com","HTTP/1.0 504 Gateway Time-out"
"050","www.pickgolfup.com","HTTP/1.0 200 OK"
"051","www.newmoncleroutletjpn.com","HTTP/1.0 200 OK"
"052","www.lvbagonsale.com","HTTP/1.0 200 OK"
"053","www.aftpq.com","HTTP/1.0 200 OK"
"054","jp.wek7.org","HTTP/1.0 504 Gateway Time-out"
"055","www.toshop-jp.com","HTTP/1.0 504 Gateway Time-out"
"056","www.monclerdown2014.com","HTTP/1.0 504 Gateway Time-out"
"057","www.moncler-outlets.net","HTTP/1.0 504 Gateway Time-out"
"058","www.montblanc-ballpen.com","HTTP/1.0 504 Gateway Time-out"
"059","www.brandedcorner.com","HTTP/1.0 200 OK"
"060","www.eurocentrichandbags.com","HTTP/1.0 200 OK"
"061","www.hamalibg.biz","HTTP/1.0 302 Moved Temporarily"
"062","cartoolplaza.com","HTTP/1.0 504 Gateway Time-out"
"063","www.c-web.biz","HTTP/1.0 302 Moved Temporarily"
"064","saleonlinejapan.com","HTTP/1.0 200 OK"
"065","www.casio.l1ids.org","HTTP/1.0 504 Gateway Time-out"
"066","www.monopolysalejp.com","HTTP/1.0 504 Gateway Time-out"
"067","www.brandedcorner.com","HTTP/1.0 200 OK"
"068","www.gaga-diy.com","HTTP/1.0 504 Gateway Time-out"
"069","www.brand-0k.com","HTTP/1.0 200 OK"
"070","www.jjkopi.com","HTTP/1.0 200 OK"
"071","www.momocak.com","HTTP/1.0 200 OK"
"072","www.bags-ladies.com","HTTP/1.0 200 OK"
"073","www.2014monkureru.com","HTTP/1.0 200 OK"
"074","www.uggaustralia-jp.asia","HTTP/1.0 200 OK"
"075","www.esprogramming.com","HTTP/1.0 200 OK"
"076","www.jimmychooonlinejapan.com","HTTP/1.0 403 Forbidden"
"077","www.wristwatchsalerep.com","HTTP/1.0 504 Gateway Time-out"
"078","www.lvutt.com","HTTP/1.0 504 Gateway Time-out"
"079","www.vuittonfans.com","HTTP/1.0 200 OK"
"080","www.vuittonu.com","HTTP/1.0 504 Gateway Time-out"
"081","www.montblancshop.cc","HTTP/1.0 200 OK"
"082","www.rexvod.com","HTTP/1.0 522 Unknown"
"083","www.northlandtimeshares.com","HTTP/1.0 200 OK"
"084","www.coachjust.com","HTTP/1.0 403 Forbidden"
"085","www.jojoeheaven.org","HTTP/1.0 200 OK"
"086","www.iphone5casesjp.com","HTTP/1.0 200 OK"
"087","www.longchampukhandbags.com","HTTP/1.0 200 OK"
"088","www.bagsyo.com","HTTP/1.0 200 OK"
"089","www.palazzogrimani.org","HTTP/1.0 200 OK"
"090","www.shoppingmaniaoutlet.com","HTTP/1.0 200 OK"
"091","babygoodsshopoutlet.com","HTTP/1.0 504 Gateway Time-out"
"092","www.yasuishoppu.com","HTTP/1.0 504 Gateway Time-out"
"093","www.amandabeamer.com","HTTP/1.0 504 Gateway Time-out"
"094","www.jpzizi.com","HTTP/1.0 504 Gateway Time-out"
"095","www.boots1000.com","HTTP/1.0 403 Forbidden"
"096","www.buyuggcheap.com","HTTP/1.0 200 OK"
"097","www.shopmarmotjp.com","HTTP/1.0 200 OK"
"098","www.iphoneswitch.com","HTTP/1.0 200 OK"
"099","www.iphonesweep.com","HTTP/1.0 200 OK"
"100","www.scoopy-doos.com","HTTP/1.0 200 OK"
"101","www.outcomesfas.com","HTTP/1.0 504 Gateway Time-out"
"102","www.japanbaggu.com","HTTP/1.0 504 Gateway Time-out"
"103","www.shoppemallmarts.com","HTTP/1.0 504 Gateway Time-out"
"104","www.yazya.biz","HTTP/1.0 403 Forbidden"
"105","www.nikon-live.com","HTTP/1.0 504 Gateway Time-out"
"106","www.armani-jp.com","HTTP/1.0 504 Gateway Time-out"
"107","www.2013monkureru.com","HTTP/1.0 200 OK"
"108","www.burberryjapan.net","HTTP/1.0 200 OK"
"109","www.jagranados.com","HTTP/1.0 504 Gateway Time-out"
"110","www.felisihannbai.com","HTTP/1.0 504 Gateway Time-out"
"111","www.brandskiaccessories.com","HTTP/1.0 504 Gateway Time-out"
"112","www.bookingsatlas.com","HTTP/1.0 400 Bad Request"
"113","www.hermes2014.in","HTTP/1.0 200 OK"
"114","www.bag-hermes.com","HTTP/1.0 200 OK"
"115","www.coachwallets-jp.info","HTTP/1.0 200 OK"
"116","www.j5sf.com","HTTP/1.0 200 OK"
"117","www.golfonhome.com","HTTP/1.0 504 Gateway Time-out"
"118","www.jonpoor.com","HTTP/1.0 504 Gateway Time-out"
"119","kawaiyukinobutsu.com","HTTP/1.0 504 Gateway Time-out"
"120","www.coachbaguu.com","HTTP/1.0 504 Gateway Time-out"
"121","jpkutucharm.com","HTTP/1.0 504 Gateway Time-out"
"122","www.wholesalelouisvuitton-japan.biz","HTTP/1.0 504 Gateway Time-out"
"123","www.フェラガモアウトレット.com","HTTP/1.0 400 Bad Request"
"124","www.vipluxuryshoppevip.com","HTTP/1.0 504 Gateway Time-out"
"125","www.flighttocapetown.com","HTTP/1.0 200 OK"
"126","kushyma.com","HTTP/1.0 504 Gateway Time-out"
"127","www.antaraarts.org","HTTP/1.0 200 OK"
"128","www.love-bag.net","HTTP/1.0 504 Gateway Time-out"
"129","www.louis-vuittonsale.info","HTTP/1.0 200 OK"
"130","www.samanthasale.info","HTTP/1.0 504 Gateway Time-out"
"131","www.cineotro.com","HTTP/1.0 403 Forbidden"
"132","shoppemallmarts.com","HTTP/1.0 504 Gateway Time-out"
"133","kingsofshoes.com","HTTP/1.0 200 OK"
"134","www.sukigo.com","HTTP/1.0 200 OK"
"135","www.storegentenjpsale.com","HTTP/1.0 200 OK"
"136","www.burandofu8899.com","HTTP/1.0 200 OK"
"137","xn--jck7c4a7b0a9km64tjqzbdoc5r1k.jp","HTTP/1.0 200 OK"
"138","www.replicaunifomushop.com","HTTP/1.0 403 Forbidden"
"139","www.unifomushop.com","HTTP/1.0 404 Not Found"
"140","jp-store-hermes.com","HTTP/1.0 200 OK"
"141","www.rb-pickoutlet.com","HTTP/1.0 200 OK"
"142","brandshopsshopping.com","HTTP/1.0 200 OK"
"143","audiocentergt.com","HTTP/1.0 504 Gateway Time-out"
"144","www.johnstonsjp.com","HTTP/1.0 504 Gateway Time-out"
"145","www.bvlgarijapan.com","HTTP/1.0 200 OK"
"146","www.bvlgarijapan.info","HTTP/1.0 200 OK"
"147","www.chloebagsoutlet.info","HTTP/1.0 200 OK"
"148","www.lv-guccl88.com","HTTP/1.0 504 Gateway Time-out"
"149","superwatch1.com","HTTP/1.0 200 OK"
"150","www.watch9.com","HTTP/1.0 200 OK"
"151","www.kopigoods.com","HTTP/1.0 200 OK"
"152","www.maikiki.com","HTTP/1.0 504 Gateway Time-out"
"153","www.oneminuteplayfestival.org","HTTP/1.0 200 OK"
"154","www.guccipursejp.info","HTTP/1.0 200 OK"
"155","www.renewingfaces.com","HTTP/1.0 302 Moved Temporarily"
"156","www.topmalltrade-blng.com","HTTP/1.0 504 Gateway Time-out"
"157","www.japanbagskan.com","HTTP/1.0 504 Gateway Time-out"
"158","www.jpzizi.com","HTTP/1.0 504 Gateway Time-out"
"159","birukenbuyma.com","HTTP/1.0 200 OK"
"160","www.eakonshop.com","HTTP/1.0 504 Gateway Time-out"
"161","www.lkjy.net","HTTP/1.0 504 Gateway Time-out"
"162","www.chloesale.pw","HTTP/1.0 200 OK"
"163","teddyni.com","HTTP/1.0 200 OK"
"164","www.searchlik.com","HTTP/1.0 504 Gateway Time-out"
"165","aquatas.com","HTTP/1.0 504 Gateway Time-out"
"166","www.loveyoushoppu.com","HTTP/1.0 504 Gateway Time-out"
"167","www.americanpunjabitribune.com","HTTP/1.0 200 OK"
"168","www.ozkaltd.com","HTTP/1.0 200 OK"
"169","www.edu-huayu.com","HTTP/1.0 504 Gateway Time-out"
"170","www.webidx.net","HTTP/1.0 200 OK"
"171","musumenohigifuto.com","HTTP/1.0 504 Gateway Time-out"
"172","www.likekopi.com","HTTP/1.0 504 Gateway Time-out"
"173","www.renewingfaces.com","HTTP/1.0 200 OK"
"174","6fos.com","HTTP/1.0 200 OK"
"175","www.dinermuseum.com","HTTP/1.0 200 OK"
"176","www.jpfigure.com","HTTP/1.0 200 OK"
"177","www.3i09.com","HTTP/1.0 504 Gateway Time-out"
"178","www.xreaderapp.com","HTTP/1.0 504 Gateway Time-out"
"179","www.viviennebestjp.com","HTTP/1.0 200 OK"
"180","www.gentenshopjp.com","HTTP/1.0 200 OK"
"181","www.celineya.net","HTTP/1.0 200 OK"
"182","www.vuittonwaell.com","HTTP/1.0 200 OK"
"183","www.brandparis6.com","HTTP/1.0 200 OK"
"184","www.hicopys.net","HTTP/1.0 200 OK"
"185","www.toryburchoutlet.info","HTTP/1.0 404 Not Found"
"186","www.onyxalliance.org","HTTP/1.0 200 OK"
"187","www.csmmetalmart.com","HTTP/1.0 200 OK"
"188","www.avongouis.com","HTTP/1.0 200 OK"
"189","www.shoppefaithsvip.com","HTTP/1.0 504 Gateway Time-out"
"190","www.youthshopcentrejp.com","HTTP/1.0 504 Gateway Time-out"
"191","www.jp-louisvuitton.com","HTTP/1.0 200 OK"
"192","www.bandfashions.com","HTTP/1.0 504 Gateway Time-out"
"193","www.jokojp.com","HTTP/1.0 200 OK"
"194","www.cnboee.com","HTTP/1.0 200 OK"
"195","www.vuittonbox.com","HTTP/1.0 200 OK"
"196","www.bestcopys.com","HTTP/1.0 200 OK"
"197","www.buyna.net","HTTP/1.0 200 OK"
"198","www.gucchiya.com","HTTP/1.0 504 Gateway Time-out"
"199","www.pradabagsale.info","HTTP/1.0 200 OK"
"200","burberrypolo.eu","HTTP/1.0 504 Gateway Time-out"
"201","www.lcmbilgimerkezi.com","HTTP/1.0 504 Gateway Time-out"
"202","www.caycecole.com","HTTP/1.0 403 Forbidden"
"203","www.ecznuons.com","HTTP/1.0 504 Gateway Time-out"
"204","tumi.wsualumuicard.com","HTTP/1.0 504 Gateway Time-out"
"205","cheap-flightsair.com","HTTP/1.0 504 Gateway Time-out"
"206","www.icoolauto.com","HTTP/1.0 504 Gateway Time-out"
"207","www.jphareru.com","HTTP/1.0 200 OK"
"208","genuinelouisvuitton.com","HTTP/1.0 200 OK"
"209","www.urbntouch.com","HTTP/1.0 200 OK"
"210","www.4pixsake.com","HTTP/1.0 200 OK"
"211","www.uthotels.biz","HTTP/1.0 200 OK"
"212","www.kurvstudios.net","HTTP/1.0 200 OK"
"213","www.educatlvosmultimedia.com","HTTP/1.0 504 Gateway Time-out"
"214","www.stylesfashionstore.com","HTTP/1.0 504 Gateway Time-out"
"215","www.flagshipband.com","HTTP/1.0 200 OK"
"216","www.asazunet.com","HTTP/1.0 200 OK"
"217","www.rackanshop.pw","HTTP/1.0 504 Gateway Time-out"
"218","www.myrcmd.pw","HTTP/1.0 504 Gateway Time-out"
"219","www.aldantownwatch.com","HTTP/1.0 200 OK"
"220","www.urbntouch.com","HTTP/1.0 200 OK"
"221","www.nksclub.com","HTTP/1.0 200 OK"
"222","www.dragontechs.com","HTTP/1.0 200 OK"
"223","cynergyper4mers.com","HTTP/1.0 200 OK"
"224","www.efgkbp.com","HTTP/1.0 200 OK"
"225","www.mamairma.info","HTTP/1.0 200 OK"
"226","www.seecurrent.asia","HTTP/1.0 504 Gateway Time-out"