[PHP] Notes on extracting the meta title from an external site's URL

2020/05/12

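When I looked into how to get the meta title from an external site, there turned out to be more ways than I expected, so these are my notes on how to use each one.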

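file_get_contents: reading a URL into a string

The syntax is very simple, and it was easy to use.
With only the URL given it tries to read the entire response, but the fifth argument sets the maximum number of bytes to read, so you can keep the amount of data read (and with it the load time) under some control.
Note: the fourth argument is the offset at which reading starts (seeking to a non-zero offset is not supported for remote files, so leave it at 0 for URLs).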

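<?php
$url = "https://twotone.me/";
// Arguments: URL, use_include_path, stream context, offset, maximum bytes to read.
// Pass false (not null) for use_include_path; null is deprecated there since PHP 8.1.
$html = file_get_contents($url, false, null, 0, 1024);
if ($html === false) {
    exit("Failed to read {$url}");
}
echo $html;
?>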
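
The snippet above only prints the fetched HTML. As a minimal sketch of the next step, pulling the title out of that string could look like the following. The function name `extract_title` is my own and not from the original notes, and a regular expression is only a rough stand-in for a real HTML parser.

```php
<?php
// Hypothetical helper (not part of the original post): returns the text of the
// first <title> element in an HTML string, or null when no complete one exists.
function extract_title(string $html): ?string
{
    // Match case-insensitively (i) and let "." span line breaks (s).
    if (!preg_match('/<title\b[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return null;
    }
    // Decode entities such as &amp; and collapse runs of whitespace.
    $title = html_entity_decode($m[1], ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return trim(preg_replace('/\s+/', ' ', $title));
}
```

Calling `extract_title($html)` on the string returned by file_get_contents would give the title. If the title element does not fit completely inside the bytes that were read (1024 in the example above), the function returns null, so the byte limit may need to be raised for some sites.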

Link: Getting file and URL contents with PHP's file_get_contents

Memo: Reference

Link: [PHP] A shortcode that gets an external site's title from a URL and generates a link

Memo: My own notes

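cURL functions: faster loading than file_get_contents

Faster than file_get_contents, and configurable through the many options that curl_setopt provides.
It also seems to let you work with HTTP request and response details.

I looked for a way to limit the number of bytes read, but could not find one.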

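<?php
$url = "https://twotone.me/";
$ch = curl_init(); // Initialize a cURL handle
curl_setopt($ch, CURLOPT_URL, $url); // URL to fetch
curl_setopt($ch, CURLOPT_HEADER, false); // Do not include response headers in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // WARNING: disables SSL certificate verification; use true in production
curl_setopt($ch, CURLOPT_TIMEOUT, 30); // Give up after 30 seconds
$html = curl_exec($ch); // Run the request; returns false on failure
if ($html === false) {
    exit("cURL error: " . curl_error($ch));
}
curl_close($ch); // Release the handle
echo $html;
?>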
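
On the byte limit: one approach that should work is CURLOPT_WRITEFUNCTION, which hands each received chunk to a callback. If the callback returns a number other than the chunk length, cURL aborts the transfer. The sketch below relies on that behavior. The names `make_capped_writer` and `fetch_head_bytes` are hypothetical, and I have not measured whether this is actually faster in practice.

```php
<?php
// Hypothetical helper: builds a write callback that keeps at most $limit bytes in $buffer.
function make_capped_writer(int $limit, string &$buffer): Closure
{
    return function ($ch, string $chunk) use ($limit, &$buffer): int {
        $remaining = $limit - strlen($buffer);
        $buffer .= substr($chunk, 0, max(0, $remaining));
        // Returning a count different from strlen($chunk) makes cURL abort the transfer.
        return strlen($buffer) >= $limit ? 0 : strlen($chunk);
    };
}

// Hypothetical wrapper: fetches roughly the first $limit bytes of $url.
function fetch_head_bytes(string $url, int $limit = 1024): string
{
    $buffer = '';
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, make_capped_writer($limit, $buffer));
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    // curl_exec() returns false when the callback aborts the transfer;
    // that is expected here, and the data collected so far is in $buffer.
    curl_exec($ch);
    return $buffer;
}
```

Usage would be along the lines of `$html = fetch_head_bytes("https://twotone.me/", 1024);`. Because the abort is reported as an error by cURL, a real implementation would probably want to tell an intentional abort apart from a genuine failure.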

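Link: Replacing PHP's file_get_contents with cURL

Memo: Covers the differences between file_get_contents and cURL, plus things to watch out for

Link: [PHP] A shortcode that uses cURL to get an external site's title and generate a link

Memo: My own notes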

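fgets: reading one line at a time

This approach reads a file one line at a time in a while loop.
Unlike the others, you have to open the file yourself with fopen and specify an access mode (r = read, w = write).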

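<?php
$url = "https://twotone.me/";
// Open the URL as a read-only stream
$file = fopen($url, "r");
// If the URL could be opened, read it one line at a time
if ($file) {
  // Compare against false so that a line such as "0" does not end the loop early
  while (($line = fgets($file)) !== false) {
    echo $line;
  }
  // Close the stream (only when it was actually opened)
  fclose($file);
}
?>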

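Link: [PHP functions] File operations with fgets

Memo: Well organized and easy to follow

Link: [PHP] Getting an external site's meta title with fgets()

Memo: When fetching the meta title this way, you can stop as soon as the title is found, so I thought it might be a little faster.

One thing I noticed while working with fgets: if the target page's source is minified, a single line can get extremely long, so depending on the site cURL may actually be faster.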
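
The early-exit idea can be sketched as below. To soften the long-line problem, this version passes a length to fgets so that each read is capped even when the page is minified. The function name `find_title_in_stream` and the limits (4096 and 65536) are my own assumptions, not values from the referenced articles.

```php
<?php
// Hypothetical helper: reads from an open stream until a complete <title> element
// has been seen, then stops. Returns null if none appears within $maxBytes.
function find_title_in_stream($stream, int $maxBytes = 65536): ?string
{
    $seen = '';
    // fgets($stream, 4096) returns at most 4095 bytes per call, so one huge
    // minified line is consumed in pieces instead of all at once.
    while (strlen($seen) < $maxBytes && ($line = fgets($stream, 4096)) !== false) {
        // Accumulate so that a title split across two reads is still matched.
        $seen .= $line;
        if (preg_match('/<title\b[^>]*>(.*?)<\/title>/is', $seen, $m)) {
            return trim(html_entity_decode($m[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
        }
    }
    return null;
}
```

With a URL this would be used as `$file = fopen($url, "r");` followed by `find_title_in_stream($file)` and `fclose($file)`. Whether it beats cURL still depends on the site, as noted above.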

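curl_multi: loading multiple files asynchronously

It appears to be a mechanism for loading multiple files efficiently with cURL.
I meant to try it, but for various reasons ended up not using it, so I will just list the reference articles.

Link: curl_multi really was blazingly fast (but I got stuck on a curl_multi_select bug)

Link: A sample of parallel HTTP requests with curl_multi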
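
For reference, a rough sketch of what parallel fetching with curl_multi might look like is below. It is not taken from the articles above and I have not used it on a real site. The function name `fetch_all` is hypothetical, and the short sleep when curl_multi_select returns -1 is only a guess at a workaround for the kind of problem the first article mentions.

```php
<?php
// Hypothetical helper: fetches several URLs in parallel and returns the bodies
// keyed like the input array, as reported by curl_multi_getcontent().
function fetch_all(array $urls, int $timeout = 30): array
{
    if ($urls === []) {
        return []; // Nothing to fetch
    }
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Keep the body instead of printing it
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        // Wait for activity; if select fails (-1), sleep briefly instead of spinning.
        if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(100000);
        }
    } while ($running > 0 && $status === CURLM_OK);
    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $results;
}
```

A call such as `fetch_all(['top' => 'https://twotone.me/'])` would return an array with the same keys. Error handling per URL is left out to keep the sketch short.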

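I looked into how to get the meta title from an external site on a whim, and there turned out to be many approaches, which made for a good learning experience. get_meta_tags($url); retrieves meta information but apparently does not pick up the title tag. Discovering that near-miss functions like this exist reminded me how much I still don't know.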
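
To see the get_meta_tags behavior without hitting a real site, a small sketch like the following can be run locally. The helper name `meta_tags_from_html` and the sample HTML are made up for illustration. The temporary file is needed because get_meta_tags takes a filename or URL rather than a string.

```php
<?php
// Hypothetical helper: runs get_meta_tags() on an HTML string via a temporary file.
function meta_tags_from_html(string $html): array
{
    $path = tempnam(sys_get_temp_dir(), 'meta');
    file_put_contents($path, $html);
    $tags = get_meta_tags($path); // Array of meta name => content, or false on failure
    unlink($path);
    return $tags === false ? [] : $tags;
}

// Made-up sample page with both a title and a meta description.
$html = '<html><head><title>Demo</title>'
      . '<meta name="description" content="A test page">'
      . '</head><body></body></html>';
$tags = meta_tags_from_html($html);
```

Dumping `$tags` with var_dump should show the description entry but nothing for the title, which is why the title has to be extracted separately.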
