Bash
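1 Preface
I recently installed the "Google XML Sitemaps" plugin for WordPress. When I submitted its output to the Baidu Search Resource Platform, the submission was rejected because the platform does not accept sitemap index files, so I decided to write my own script to generate the URL list and push it.
For this site's sitemap index, see the following link: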
https://www.cmdschool.org/sitemap.xml
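2 Best practice
2.1 Get the push API endpoint
The link submission menu of the Baidu Search Resource Platform is available at the following link: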
https://ziyuan.baidu.com/linksubmit/index
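[Screenshot: the link submission page]
On that page, click 【资源提交】 (Resource Submission) -> 【API提交】 (API Submission) to look up your push endpoint. Note the "site" and "token" parameters; both are needed shortly.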
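As a minimal sketch of how these two values fit together: the script in section 2.2 appends them as query parameters to the endpoint http://data.zz.baidu.com/urls. The helper name build_push_endpoint is hypothetical, and the token shown is a placeholder.

```shell
#!/bin/bash
# Hypothetical helper: compose the push endpoint from "site" and "token".
build_push_endpoint() {
    local site="$1" token="$2"
    printf 'http://data.zz.baidu.com/urls?site=%s&token=%s\n' "$site" "$token"
}

# Placeholder values; substitute the ones shown on your API submission page.
build_push_endpoint "https://www.cmdschool.org" "XXXXXXXXXXXXXXXX"
```

Printing the composed URL before wiring it into a script makes it easy to compare against the endpoint shown on the platform page.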
2.2 Create the processing script
mkdir -p ~/scripts/
vim ~/scripts/pushUrls-baidu.sh
Add the following script:
#!/bin/bash
# Working files and log location
siteMapListPath="/tmp/sitemap-baidu.list"
pushUrlsPath="/tmp/pushUrls.txt"
newUrlsPath="/tmp/newhUrls.txt"
oldUrlsPath="/tmp/oldUrls.txt"
exeLog="/var/log/pushUrls.log"
dataTime="$(date +'%Y-%m-%d %H:%M:%S')"
# Site and token, as shown on the API submission page
site="https://www.cmdschool.org"
token="XXXXXXXXXXXXXXXX"

# Empty the work files left over from the previous run
if [ -f "$newUrlsPath" ]; then
    cat /dev/null > "$newUrlsPath"
fi
if [ -f "$pushUrlsPath" ]; then
    cat /dev/null > "$pushUrlsPath"
fi
# Create the history file on the first run
if [ ! -f "$oldUrlsPath" ]; then
    cat /dev/null > "$oldUrlsPath"
fi

# Extract the sub-sitemap URLs (*.xml) from the sitemap index
curl -s "$site/sitemap.xml" | grep "loc" | grep ".xml" | sed -e "s/<loc>//g" -e "s/<\/loc>//g" -e "s/^[ \t]*//g" > "$siteMapListPath"

# Collect the page URLs from every sub-sitemap
for i in $(cat "$siteMapListPath"); do
    curl -s "$i" | grep "<loc>" | sed -e 's/^[ \t]*//g' -e 's/<loc>//g' -e 's/<\/loc>//g' >> "$newUrlsPath"
done

# Lines marked "< " by diff exist only in the new list, so they still need to be pushed
diff "$newUrlsPath" "$oldUrlsPath" | grep "https" | grep "<" | sed -e 's/^< //g' > "$pushUrlsPath"

# Nothing new: log it and stop
if [ ! -s "$pushUrlsPath" ]; then
    echo "$dataTime No push required!" | tee -a "$exeLog"
    exit 0
fi

# Push the new URLs, one per line, as the plain-text request body
execMess=$(curl -s -H 'Content-Type:text/plain' --data-binary @"$pushUrlsPath" "http://data.zz.baidu.com/urls?site=$site&token=$token")

if echo "$execMess" | grep -q "success"; then
    echo "$dataTime Below link push is completed," | tee -a "$exeLog"
    cat "$pushUrlsPath" | tee -a "$exeLog"
    echo "$execMess" | tee -a "$exeLog"
    # Remember what has been pushed so the next run only sends new URLs
    cat "$newUrlsPath" > "$oldUrlsPath"
    exit 0
else
    echo "$dataTime Execution exits unexpectedly" | tee -a "$exeLog"
    # Keep the raw response for troubleshooting and signal failure
    echo "$execMess" | tee -a "$exeLog"
    exit 1
fi
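The two core steps of the script, pulling URLs out of <loc> elements and keeping only those missing from the previous run, can be tried offline. The following is a sketch on made-up sample data; the helper names extract_locs and new_only and the example.org URLs are hypothetical, and it assumes each <loc> element sits on its own line, as in the sitemaps this script targets.

```shell
#!/bin/bash
# Sketch with made-up sample data; no network access is needed.
workDir="$(mktemp -d)"

# Hypothetical helper: print the content of every <loc> element, one per line.
extract_locs() {
    grep "<loc>" | sed -e 's/^[[:space:]]*//' -e 's/<loc>//g' -e 's/<\/loc>.*$//'
}

# Hypothetical helper: print lines found in the first file but not in the second.
new_only() {
    diff "$1" "$2" | grep '^< ' | sed -e 's/^< //'
}

# A tiny sitemap fragment standing in for a downloaded sub-sitemap.
cat > "$workDir/sitemap.xml" <<'EOF'
<urlset>
  <url>
    <loc>https://www.example.org/archives/1</loc>
  </url>
  <url>
    <loc>https://www.example.org/archives/2</loc>
  </url>
</urlset>
EOF

# URLs recorded by an earlier run.
printf '%s\n' "https://www.example.org/archives/1" > "$workDir/old.txt"

extract_locs < "$workDir/sitemap.xml" > "$workDir/new.txt"
new_only "$workDir/new.txt" "$workDir/old.txt"   # prints https://www.example.org/archives/2

rm -rf "$workDir"
```

Because the comparison relies on diff, it behaves best when the sitemap lists URLs in a stable order between runs.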
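Adjust the following variables to match your own setup:
– Variable "site": the URL of your site; in this example "https://www.cmdschool.org"
– Variable "token": the access token issued by Baidu, shown on the API submission page; in this example "XXXXXXXXXXXXXXXX"
After saving the script, test it with the following command: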
sh ~/scripts/pushUrls-baidu.sh
Once it has finished, view the push log with the following command:
tail -f /var/log/pushUrls.log
The output looks like this:
2020-12-11 09:55:18 Below link push is completed,
https://www.cmdschool.org/archives/1232
{"remain":99603,"success":1}
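The last line means:
– Field "remain": the number of URLs that can still be pushed today; "99603" in this example
– Field "success": the number of URLs successfully pushed by this request; "1" in this example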
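If a script needs these numbers, for example to stop pushing when the quota runs out, they can be extracted from the response. The sketch below assumes the response is a flat JSON object without whitespace, exactly like the sample above; the helper name json_number is hypothetical, and a real JSON parser would be more robust.

```shell
#!/bin/bash
# Hypothetical helper: print the integer value of a numeric field from a
# flat JSON object such as {"remain":99603,"success":1}.
json_number() {
    local key="$1" json="$2"
    printf '%s\n' "$json" | sed -n "s/.*\"$key\":\([0-9][0-9]*\).*/\1/p"
}

response='{"remain":99603,"success":1}'
echo "pushed: $(json_number success "$response")"      # pushed: 1
echo "quota left: $(json_number remain "$response")"   # quota left: 99603
```

Checking that "success" is greater than zero is a stricter test than only looking for the word "success" in the response.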
2.3 Add a cron job to run the script
crontab -e
Then add the following entry, which runs the script every 30 minutes:
*/30 * * * * sh ~/scripts/pushUrls-baidu.sh