Enable DNSSEC for Aliyun Domain to Prevent DNS Hijacking

August 13, 2019 | Miscellaneous

While using Cloudflare day to day, I noticed that the free DNSSEC option under DNS was not enabled. How could that be? Let’s first look at what DNSSEC is.……

Read more

Use Cloudflare Workers to Accelerate a WordPress Blog for Free

August 5, 2019 | Miscellaneous

To improve user experience, websites try to cut loading times wherever they can. Google launched AMP and Baidu launched MIP, but both require deploying a separate set of site code, which is complicated and raises development costs.……

Read more

How to fix Google AdSense warnings about revenue loss risk due to ads.txt issues

August 1, 2019 | Miscellaneous

Recently, AdSense has been warning about a risk of revenue loss: some ads.txt file issues need to be fixed to avoid serious damage to income. Although the amount at stake is small, the prompt is reason enough to fix the file properly.……

Read more

Reverse Proxy TCP/UDP Requests with Nginx to Map Remote Servers

July 26, 2019 | Miscellaneous

Nginx is a high-performance HTTP server and reverse proxy, as well as an IMAP/POP3/SMTP proxy server. Its stream module has supported TCP port forwarding since version 1.9.0 and UDP forwarding since version 1.9.13.……

Read more

Install VNC Server on Alibaba Cloud CentOS 7 for Graphical Access

July 20, 2019 | Miscellaneous

This guide shows how to install TigerVNC and a desktop environment on CentOS 7 running on Alibaba Cloud ECS, then verify the VNC service and open only the required security-group rule.……

Read more

Tips for Testing Interfaces with Google Chrome

July 17, 2019 | Miscellaneous

When writing web scrapers, setting headers and cookies by hand is tedious and error-prone. This post introduces a convenient way to generate Python requests code with Chrome’s built-in tools.……

Read more

10 Tips to Improve Your Python Data Analysis Skills

July 8, 2019 | Technology

This article collects 10 practical tips for Python data analysis and Jupyter Notebook workflows, including dataset profiling, interactive plotting, notebook magics, debugging, and a few time-saving shortcuts.……

Read more

How to Enable IPv6 Access on Alibaba Cloud ECS

July 5, 2019 | Miscellaneous

This guide shows how to enable IPv6 on Alibaba Cloud ECS with Tunnelbroker, including the kernel settings, tunnel setup, connectivity checks, and the Nginx listener changes required to serve traffic over IPv6.……

Read more

Using SoftEther VPN to Set Up OpenVPN to Bypass Webpage Authentication

July 3, 2019 | Miscellaneous

The campus network requires phone-number authentication, but because the SIM card is expensive and has been discontinued, the login page redirects to a recharge page. Free IPv6 is available, but most of the internet cannot be reached over IPv6.……

Read more

Serialization and Deserialization in Python

July 1, 2019 | Technology

This article explains the basics of Python serialization and deserialization with pickle, including file-based examples, dumps() and loads(), and one important safety warning for real-world use.……

Read more

Build a Google Mirror Site Using Docker

June 27, 2019 | Miscellaneous

The router in my new environment does not support installing Shadowsocks or V2Ray, so I cannot reach Google to search its vast body of English technical content. Here, we use the official Google mirror container to build a Google mirror site and map it to an existing domain.

Requirements:

A VPS, for example from Vultr.
A domain name. Here we use google.bobobk.com as the Google mirror domain.

I chose an Amazon VPS. Since I couldn’t find a good CentOS image, I used Ubuntu as the operating system.

Steps to build the mirror site:

Point the domain’s DNS record at your VPS
Add the site to your Nginx server. I used the BT (BaoTa) panel, which is quite convenient.
Edit the site’s config file to set up a reverse proxy to the port mapped by Docker

1. Set domain resolution

Since I use Cloudflare CDN, I’ll use it as an example.

……

Read more

Extracting Free High Anonymity Proxies with Python3

June 25, 2019 | Miscellaneous

Writing web crawlers often leads to problems like IP bans or rate limits. Having an efficient IP proxy pool is quite important. Here, we introduce how to extract valid IPs from public proxy sources and build your own efficient crawler proxy pool.

Main Modules:

Use requests to crawl proxies
Update and check available proxies

Crawling Proxies with `requests`, using xici as an example

Open the xici high-anonymity proxy page and inspect its elements.

Each proxy sits in a tr under the element with id ip_list, and its details are in the td cells. The CSS selector can therefore be
content.css("#ip_list").css("tr"), after which we take the 1st, 2nd, and 6th text cells (IP, port, and protocol).
We then add IP availability checking and store the proxies that pass in a JSON file, so the list of available proxies can be served over HTTP.

#!/root/anaconda3/bin/python

import json
import random

import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
from scrapy.selector import Selector

# Suppress urllib3's InsecureRequestWarning for requests made without certificate verification.
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

def get_headers():
    # Pick a random User-Agent so that successive requests look less uniform.
    USER_AGENT_LIST = [
        'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; 360SE)',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
        'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E; 360SE)',
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/18.17763',
        'Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36',
        'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0',
    ]
    USER_AGENT = random.choice(USER_AGENT_LIST)
    return {'User-Agent': USER_AGENT}

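def get_random_proxy():
    # Only HTTPS proxies are considered, because the proxy-list pages are fetched over HTTPS.
    https_pro = [i for i in pro if "https" in i]
    if len(https_pro) == 0:
        return None
    # random.choice avoids the IndexError that randint(0, len(...)) can raise (its upper bound is inclusive).
    return random.choice(https_pro)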

def crawl_ip():
    for page in range(5):
        page_url = 'https://www.xicidaili.com/nn/{}'.format(page + 1)
        rand_ip = get_random_proxy()
        if rand_ip:
            # Route the request through a validated proxy when one is available.
            r = requests.get(page_url, headers=get_headers(), proxies=proxies_ip(rand_ip))
        else:
            r = requests.get(page_url, headers=get_headers())
        content = Selector(text=r.text)
        ip_list = content.css("#ip_list").css("tr")
        for row in ip_list[1:]:  # skip the first row of the table
            info = row.css("td::text").extract()
            if len(info) < 6:
                continue  # row does not have the expected cells
            ip = info[0]
            protocol = info[5].strip().lower()
            if protocol == "http" or protocol == "https":
                url = protocol + '://' + ip + ':' + info[1]
            else:
                url = 'http://' + ip + ':' + info[1]
            validate_ip(url)

def proxies_ip(url):
    if 'https' not in url:
        proxies={'http':url}
    else:
        proxies={'https':url}
    return proxies

def validate_ip(url):
    proxies = proxies_ip(url)
    if url not in pro:
        bobo_url=http_url
        if "https" in url:
            bobo_url=https_url
        try:
            r = requests.get(bobo_url, headers=get_headers(), proxies=proxies, timeout=1)
            pro.append(url)
            print('ip %s validated' % url)
        except Exception as e:
            print('cant check ip %s' % url)

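def check_current_ip(): # Update and check usable proxies
    # Re-validate the proxies saved by the previous run; a missing or empty file means there is nothing to check.
    try:
        with open(JSON_PATH) as fr:
            curr = fr.read()
    except FileNotFoundError:
        return
    if curr.strip() != '':
        for url in json.loads(curr):
            validate_ip(url)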

if __name__ =='__main__':
    http_url = "http://www.bobobk.com"
    https_url = "https://www.bobobk.com"
    pro = []
    TXT_PATH = '/www/wwwroot/default/daili.txt'
    JSON_PATH='/www/wwwroot/default/daili.json'
    PROXYCHAIN_CONF='/www/wwwroot/default/proxy.conf'
    check_current_ip()
    crawl_ip()

    # The with blocks close the files, so no explicit close() is needed.
    with open(JSON_PATH,'w') as fw:
        fw.write(json.dumps(list(set(pro))))

    with open(TXT_PATH,'w') as fw:
        for i in set(pro):
            fw.write(i+"\n")
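
The row-parsing step of the script can be looked at in isolation. The following is a minimal sketch, not the author's code: build_proxy_url is a hypothetical helper name, and the sample row is made-up data shaped like the list that i.css("td::text").extract() is assumed to return (IP first, port second, protocol sixth).

```python
from typing import List, Optional


def build_proxy_url(cells: List[str]) -> Optional[str]:
    """Turn one table row's text cells into a proxy URL, or None if the row is too short."""
    if len(cells) < 6:
        return None
    ip = cells[0].strip()
    port = cells[1].strip()
    protocol = cells[5].strip().lower()
    # Anything that is not explicitly http/https falls back to plain http,
    # mirroring the else branch in the script above.
    if protocol not in ("http", "https"):
        protocol = "http"
    return "{}://{}:{}".format(protocol, ip, port)


# Hypothetical row (documentation-range IP address), not real proxy data.
sample_cells = ["203.0.113.7", "8080", "\n  ", "\n  ", "high anonymity", "HTTPS"]
print(build_proxy_url(sample_cells))
```

Keeping this step in a small pure function makes it easy to check without touching the network, and a short row is skipped instead of raising an IndexError.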

Update and Check Usable Proxies

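Before fetching each page, the script looks for a usable proxy in the pool and, if one exists, sends the request through it to fetch new proxies. Saved proxies are re-validated at the start of every run, so the pool stays current and the script can run stably.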
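
The update-and-check cycle can be sketched as a single function. This is an illustrative sketch under stated assumptions: refresh_pool and fake_is_alive are hypothetical names, and the checker is passed in as a parameter so the example runs offline. In the real script the check would be a request sent through the proxy, as validate_ip does.

```python
from typing import Callable, Iterable, List


def refresh_pool(saved: Iterable[str],
                 candidates: Iterable[str],
                 is_alive: Callable[[str], bool]) -> List[str]:
    """Re-check saved proxies first, then new candidates; return live ones without duplicates."""
    pool: List[str] = []
    seen = set()
    for url in list(saved) + list(candidates):
        if url in seen:
            continue  # each proxy is checked at most once
        seen.add(url)
        if is_alive(url):
            pool.append(url)
    return pool


def fake_is_alive(url: str) -> bool:
    # Stand-in checker (assumption): treats any proxy on port 1 as dead.
    return not url.endswith(":1")


saved = ["http://203.0.113.7:8080", "https://203.0.113.9:1"]
fresh = ["http://203.0.113.7:8080", "https://198.51.100.4:443"]
print(refresh_pool(saved, fresh, fake_is_alive))
```

Checking the saved proxies before the fresh ones matches the order in the script, where check_current_ip() runs before crawl_ip().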

……

Read more


春江暮客
