How to Enable IPv6 Access on Alibaba Cloud ECS

By default, the CentOS image on Alibaba Cloud ships with IPv6 commented out. Enabling native IPv6 requires the dual-stack IPv4/IPv6 network, which has to be applied for as a public beta and is not very convenient. Instead, we can use an IPv6 tunnel provided by tunnelbroker.net to enable IPv6 access. Also, the campus China Telecom network hands out IPv6 addresses directly, and that IPv6 traffic is typically not billed, so if the Alibaba Cloud server is reachable over IPv6 you can use it to get online for free.

1. Enable IPv6 Access

vi /etc/sysctl.conf

Uncomment the following three lines and change the value from 1 to 0, as shown below:

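The three entries are almost certainly the standard disable_ipv6 switches; after uncommenting them and setting the value to 0, /etc/sysctl.conf should contain:

net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.default.disable_ipv6 = 0
net.ipv6.conf.lo.disable_ipv6 = 0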

Then reload the configuration with:

sysctl -p

IPv6 should now be supported.
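
A quick way to verify is to check the sysctl value and confirm that the loopback interface now has its ::1 address:

sysctl net.ipv6.conf.all.disable_ipv6
ip addr show lo | grep inet6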

2. Get an IPv6 Address via Tunnelbroker

Go to https://tunnelbroker.net and register for an account. Make sure your password is complex enough or the registration may fail.

After logging in, go to the bottom left and select “Create Regular Tunnel”:

……


Serialization and Deserialization in Python

Sometimes you need to store data temporarily so that it can be loaded directly the next time the program runs, or exchanged between different threads. Serialization is a way of storing data for exactly this purpose, and here we explain serialization and deserialization in Python using the pickle package. ……
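
A minimal sketch of the idea with pickle (my own example, not the article's elided code):

import pickle

data = {'name': 'bobobk', 'scores': [1, 2, 3]}

# Serialize: write the object to a file in pickle's binary format
with open('data.pkl', 'wb') as fw:
    pickle.dump(data, fw)

# Deserialize: read the bytes back into an equivalent Python object
with open('data.pkl', 'rb') as fr:
    restored = pickle.load(fr)

print(restored == data)  # True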


Build a Google Mirror Site Using Docker

In the new environment the router cannot run Shadowsocks or V2Ray, so using Google to search the vast amount of English technical content is no longer possible from it. Here, we use the official Google mirror container to build a Google mirror site and map it to our existing domain.

Requirements:

  1. A VPS such as Vultr, etc.
  2. A domain name. In this case, we use google.bobobk.com as the Google mirror domain.

I chose an Amazon VPS. Since I couldn’t find a good CentOS image, I used Ubuntu as the operating system.

Steps to build the mirror site:

  1. Set domain DNS to point to your VPS
  2. Add the site to your nginx server. I used the BT (BaoTa) panel, which is quite convenient.
  3. Modify the nginx config to set up a reverse proxy to Docker's mapped port (a rough sketch follows below)
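
A rough sketch of what the Docker and nginx pieces look like; the image name and the host port 8888 are placeholders of my own, not values from the article:

# Run the mirror container, publishing its web port on the host (image name is a placeholder)
docker run -d --name google-mirror -p 8888:80 some/google-mirror-image

# nginx server block for google.bobobk.com, reverse-proxying to the container's mapped port
server {
    listen 80;
    server_name google.bobobk.com;

    location / {
        proxy_pass http://127.0.0.1:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}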

1. Set domain resolution

Since I use Cloudflare CDN, I’ll use it as an example.

……


Extracting Free High Anonymity Proxies with Python3

Writing web crawlers often leads to problems like IP bans or rate limits. Having an efficient IP proxy pool is quite important. Here, we introduce how to extract valid IPs from public proxy sources and build your own efficient crawler proxy pool.

Main Modules:

  1. Use requests to crawl proxies
  2. Update and check available proxies

Crawling Proxies with requests, using xici as an example

High-anonymity proxy page: xici; inspect the elements in the browser.


Each proxy sits in a tr row under the element with id ip_list, and the details are in its td cells. The CSS selector can therefore be content.css("#ip_list").css("tr"), after which we extract the 1st and 6th td values (the IP and the protocol).
We then add a check for whether each IP actually works and store the ones that pass into a JSON file, so the list of available proxies can afterwards be fetched over HTTP.

#!/root/anaconda3/bin/python

import json
import random

import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
from scrapy.selector import Selector

# Silence the warnings emitted for unverified HTTPS requests
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

def get_headers():
    """Return request headers with a randomly chosen User-Agent."""
    USER_AGENT_LIST = [
        'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; 360SE)',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
        'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E; 360SE)',
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/18.17763',
        'Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36',
        'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0',
    ]
    return {'User-Agent': random.choice(USER_AGENT_LIST)}

def get_random_proxy():
    """Pick a random HTTPS proxy from the validated pool, or None if there is none."""
    https_pro = [i for i in pro if "https" in i]
    if not https_pro:
        return None
    return random.choice(https_pro)

def proxies_ip(url):
    """Build the proxies dict that requests expects for a single proxy URL."""
    if 'https' not in url:
        proxies = {'http': url}
    else:
        proxies = {'https': url}
    return proxies

def validate_ip(url):
    """Check a proxy by fetching a test URL through it; keep it if the request succeeds."""
    proxies = proxies_ip(url)
    if url not in pro:
        bobo_url = http_url
        if "https" in url:
            bobo_url = https_url
        try:
            requests.get(bobo_url, headers=get_headers(), proxies=proxies, timeout=1)
            pro.append(url)
            print('ip %s validated' % url)
        except Exception:
            print('cant check ip %s' % url)

def crawl_ip():
    """Crawl the first five pages of the xici high-anonymity list and validate each proxy."""
    for page in range(5):
        page_url = 'https://www.xicidaili.com/nn/{}'.format(page + 1)
        rand_ip = get_random_proxy()
        if rand_ip:
            # Route the request through an already validated proxy when we have one
            r = requests.get(page_url, headers=get_headers(), proxies=proxies_ip(rand_ip))
        else:
            r = requests.get(page_url, headers=get_headers())
        content = Selector(text=r.text)
        ip_list = content.css("#ip_list").css("tr")
        for row in ip_list[1:]:  # skip the table header row
            info = row.css("td::text").extract()
            ip = info[0]
            port = info[1]
            protocol = info[5].strip().lower()
            if protocol == "http" or protocol == "https":
                url = protocol + '://' + ip + ':' + port
            else:
                url = 'http://' + ip + ':' + port
            validate_ip(url)

def check_current_ip():
    """Update and check usable proxies: re-validate the proxies saved by the previous run."""
    try:
        curr = open(JSON_PATH).read()
    except FileNotFoundError:
        return
    if curr != '':
        for url in json.loads(curr):
            validate_ip(url)

if __name__ == '__main__':
    http_url = "http://www.bobobk.com"
    https_url = "https://www.bobobk.com"
    pro = []
    TXT_PATH = '/www/wwwroot/default/daili.txt'
    JSON_PATH = '/www/wwwroot/default/daili.json'
    PROXYCHAIN_CONF = '/www/wwwroot/default/proxy.conf'
    check_current_ip()
    crawl_ip()

    # Persist the deduplicated pool as JSON and as a plain text list
    with open(JSON_PATH, 'w') as fw:
        fw.write(json.dumps(list(set(pro))))

    with open(TXT_PATH, 'w') as fw:
        for i in set(pro):
            fw.write(i + "\n")

Update and Check Usable Proxies

Before fetching each listing page, the script checks for a usable proxy and, when one is available, automatically routes the request through it to fetch new proxies. Set up this way, the pool keeps itself refreshed and runs stably.
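
Once daili.json sits under the web root, any crawler can pull the list over HTTP and pick a proxy at random. A minimal consumer sketch (the URL is my assumption about where the file ends up being served, not something stated in the article):

#!/usr/bin/env python3
import json
import random
import requests

# Hypothetical URL: assumes the web server exposes daili.json at the site root
POOL_URL = 'https://www.bobobk.com/daili.json'

pool = json.loads(requests.get(POOL_URL, timeout=5).text)
proxy = random.choice(pool)

# requests wants a dict keyed by scheme; reuse the same convention as the crawler script
scheme = 'https' if proxy.startswith('https') else 'http'
r = requests.get('https://www.bobobk.com', proxies={scheme: proxy}, timeout=5)
print(r.status_code)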

……


Sharing These Python Tips

Despite having programmed in Python for many years, I’m still amazed by how clean the code can be and how well it adheres to the DRY (Don’t Repeat Yourself) programming principle. My experience over the years has taught me many small tricks and pieces of knowledge, mostly gained from reading popular open-source software like Django, Flask, and Requests.

Here are a few tips I’ve picked out that are often overlooked, but can genuinely help us in daily programming.


1. Dictionary Comprehensions and Set Comprehensions

Most Python programmers know and use list comprehensions. If you're not familiar with the concept, a list comprehension is a shorter, more concise way to create a list.

>>> some_list = [1, 2, 3, 4]
>>> another_list = [x + 1 for x in some_list]
>>> another_list
[2, 3, 4, 5]

Since Python 3, we can use the same syntax to create sets and dictionaries:
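
A quick sketch of that syntax, using my own example rather than the one elided from the article:

>>> some_list = [1, 2, 3, 4]
>>> {x: x * x for x in some_list}          # dict comprehension
{1: 1, 2: 4, 3: 9, 4: 16}
>>> squares = {x * x for x in some_list}   # set comprehension
>>> sorted(squares)
[1, 4, 9, 16]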

……
