How to Enable IPv6 Access on Alibaba Cloud ECS

By default, the CentOS image on Alibaba Cloud ships with IPv6 commented out. Enabling native IPv6 requires the dual-stack IPv4/IPv6 network, which has to be applied for as a public beta and is not very convenient. Instead, we can use an IPv6 tunnel provided by tunnelbroker.net to enable IPv6 access. Also, the campus China Telecom network hands out IPv6 addresses directly, and that IPv6 traffic is typically not billed, so if the Alibaba Cloud server is reachable over IPv6 you can use it to get online for free.

1. Enable IPv6 Access

vi /etc/sysctl.conf

Uncomment the following three lines and change the value from 1 to 0, as shown below:

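The three entries are almost certainly the standard disable_ipv6 switches; after uncommenting them and setting the value to 0, /etc/sysctl.conf should contain:

net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.default.disable_ipv6 = 0
net.ipv6.conf.lo.disable_ipv6 = 0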

Then reload the configuration with:

sysctl -p

IPv6 should now be supported.
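
A quick way to verify is to check the sysctl value and confirm that the loopback interface now has its ::1 address:

sysctl net.ipv6.conf.all.disable_ipv6
ip addr show lo | grep inet6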

2. Get an IPv6 Address via Tunnelbroker

Go to https://tunnelbroker.net and register for an account. Make sure your password is complex enough or the registration may fail.

After logging in, go to the bottom left and select “Create Regular Tunnel”:

……


Serialization and Deserialization in Python

Sometimes you need to store data temporarily so that it can be loaded directly the next time the program runs, or exchanged between different threads. Serialization is a way of storing data for exactly this purpose, and here we explain serialization and deserialization in Python using the pickle package. ……
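
A minimal sketch of the idea with pickle (my own example, not the article's elided code):

import pickle

data = {'name': 'bobobk', 'scores': [1, 2, 3]}

# Serialize: write the object to a file in pickle's binary format
with open('data.pkl', 'wb') as fw:
    pickle.dump(data, fw)

# Deserialize: read the bytes back into an equivalent Python object
with open('data.pkl', 'rb') as fr:
    restored = pickle.load(fr)

print(restored == data)  # True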


Build a Google Mirror Site Using Docker

In the new environment the router cannot run Shadowsocks or V2Ray, so using Google to search the vast amount of English technical content is no longer possible from it. Here, we use the official Google mirror container to build a Google mirror site and map it to our existing domain.

Requirements:

  1. A VPS such as Vultr, etc.
  2. A domain name. In this case, we use google.bobobk.com as the Google mirror domain.

I chose an Amazon VPS. Since I couldn’t find a good CentOS image, I used Ubuntu as the operating system.

Steps to build the mirror site:

  1. Set domain DNS to point to your VPS
  2. Add the site to your nginx server. I used the BT (BaoTa) panel, which is quite convenient.
  3. Modify the nginx config to set up a reverse proxy to Docker's mapped port (a rough sketch follows below)
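
A rough sketch of what the Docker and nginx pieces look like; the image name and the host port 8888 are placeholders of my own, not values from the article:

# Run the mirror container, publishing its web port on the host (image name is a placeholder)
docker run -d --name google-mirror -p 8888:80 some/google-mirror-image

# nginx server block for google.bobobk.com, reverse-proxying to the container's mapped port
server {
    listen 80;
    server_name google.bobobk.com;

    location / {
        proxy_pass http://127.0.0.1:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}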

1. Set domain resolution

Since I use Cloudflare CDN, I’ll use it as an example.

……


Extracting Free High Anonymity Proxies with Python3

Writing web crawlers often leads to problems like IP bans or rate limits. Having an efficient IP proxy pool is quite important. Here, we introduce how to extract valid IPs from public proxy sources and build your own efficient crawler proxy pool.

Main Modules:

  1. Use requests to crawl proxies
  2. Update and check available proxies

Crawling Proxies with requests, using xici as an example

High-anonymity proxy page: xici; inspect the elements in the browser.


Each proxy sits in a tr row under the element with id ip_list, and the details are in its td cells. The CSS selector can therefore be content.css("#ip_list").css("tr"), after which we extract the 1st and 6th td values (the IP and the protocol).
We then add a check for whether each IP actually works and store the ones that pass into a JSON file, so the list of available proxies can afterwards be fetched over HTTP.

#!/root/anaconda3/bin/python

import json
import random

import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
from scrapy.selector import Selector

# Silence the warnings emitted for unverified HTTPS requests
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

def get_headers():
    """Return request headers with a randomly chosen User-Agent."""
    USER_AGENT_LIST = [
        'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; 360SE)',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
        'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E; 360SE)',
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/18.17763',
        'Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36',
        'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0',
    ]
    return {'User-Agent': random.choice(USER_AGENT_LIST)}

def get_random_proxy():
    """Pick a random HTTPS proxy from the validated pool, or None if there is none."""
    https_pro = [i for i in pro if "https" in i]
    if not https_pro:
        return None
    return random.choice(https_pro)

def proxies_ip(url):
    """Build the proxies dict that requests expects for a single proxy URL."""
    if 'https' not in url:
        proxies = {'http': url}
    else:
        proxies = {'https': url}
    return proxies

def validate_ip(url):
    """Check a proxy by fetching a test URL through it; keep it if the request succeeds."""
    proxies = proxies_ip(url)
    if url not in pro:
        bobo_url = http_url
        if "https" in url:
            bobo_url = https_url
        try:
            requests.get(bobo_url, headers=get_headers(), proxies=proxies, timeout=1)
            pro.append(url)
            print('ip %s validated' % url)
        except Exception:
            print('cant check ip %s' % url)

def crawl_ip():
    """Crawl the first five pages of the xici high-anonymity list and validate each proxy."""
    for page in range(5):
        page_url = 'https://www.xicidaili.com/nn/{}'.format(page + 1)
        rand_ip = get_random_proxy()
        if rand_ip:
            # Route the request through an already validated proxy when we have one
            r = requests.get(page_url, headers=get_headers(), proxies=proxies_ip(rand_ip))
        else:
            r = requests.get(page_url, headers=get_headers())
        content = Selector(text=r.text)
        ip_list = content.css("#ip_list").css("tr")
        for row in ip_list[1:]:  # skip the table header row
            info = row.css("td::text").extract()
            ip = info[0]
            port = info[1]
            protocol = info[5].strip().lower()
            if protocol == "http" or protocol == "https":
                url = protocol + '://' + ip + ':' + port
            else:
                url = 'http://' + ip + ':' + port
            validate_ip(url)

def check_current_ip():
    """Update and check usable proxies: re-validate the proxies saved by the previous run."""
    try:
        curr = open(JSON_PATH).read()
    except FileNotFoundError:
        return
    if curr != '':
        for url in json.loads(curr):
            validate_ip(url)

if __name__ == '__main__':
    http_url = "http://www.bobobk.com"
    https_url = "https://www.bobobk.com"
    pro = []
    TXT_PATH = '/www/wwwroot/default/daili.txt'
    JSON_PATH = '/www/wwwroot/default/daili.json'
    PROXYCHAIN_CONF = '/www/wwwroot/default/proxy.conf'
    check_current_ip()
    crawl_ip()

    # Persist the deduplicated pool as JSON and as a plain text list
    with open(JSON_PATH, 'w') as fw:
        fw.write(json.dumps(list(set(pro))))

    with open(TXT_PATH, 'w') as fw:
        for i in set(pro):
            fw.write(i + "\n")

Update and Check Usable Proxies

Before fetching each listing page, the script checks for a usable proxy and, when one is available, automatically routes the request through it to fetch new proxies. Set up this way, the pool keeps itself refreshed and runs stably.
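
Once daili.json sits under the web root, any crawler can pull the list over HTTP and pick a proxy at random. A minimal consumer sketch (the URL is my assumption about where the file ends up being served, not something stated in the article):

#!/usr/bin/env python3
import json
import random
import requests

# Hypothetical URL: assumes the web server exposes daili.json at the site root
POOL_URL = 'https://www.bobobk.com/daili.json'

pool = json.loads(requests.get(POOL_URL, timeout=5).text)
proxy = random.choice(pool)

# requests wants a dict keyed by scheme; reuse the same convention as the crawler script
scheme = 'https' if proxy.startswith('https') else 'http'
r = requests.get('https://www.bobobk.com', proxies={scheme: proxy}, timeout=5)
print(r.status_code)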

……


Sharing These Python Tips

Despite having programmed in Python for many years, I’m still amazed by how clean the code can be and how well it adheres to the DRY (Don’t Repeat Yourself) programming principle. My experience over the years has taught me many small tricks and pieces of knowledge, mostly gained from reading popular open-source software like Django, Flask, and Requests.

Here are a few tips I’ve picked out that are often overlooked, but can genuinely help us in daily programming.


1. Dictionary Comprehensions and Set Comprehensions

Most Python programmers know and use list comprehensions. If you're not familiar with the concept, a list comprehension is a shorter, more concise way to create a list.

>>> some_list = [1, 2, 3, 4]
>>> another_list = [x + 1 for x in some_list]
>>> another_list
[2, 3, 4, 5]

Since Python 3, we can use the same syntax to create sets and dictionaries:
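
A quick sketch of that syntax, using my own example rather than the one elided from the article:

>>> some_list = [1, 2, 3, 4]
>>> {x: x * x for x in some_list}          # dict comprehension
{1: 1, 2: 4, 3: 9, 4: 16}
>>> squares = {x * x for x in some_list}   # set comprehension
>>> sorted(squares)
[1, 4, 9, 16]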

……
