Articles in the Technology category

Parallelism in One Line of Python Code

Python has a somewhat notorious reputation when it comes to program parallelization. Technical issues aside, such as thread implementation and the GIL, I believe incorrect teaching guidance is the main problem. Common classic Python multithreading and multiprocessing tutorials often seem “heavy” and tend to scratch the surface without deeply exploring the most useful content for daily work.……
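The excerpt stops before the article's own example. As a rough illustration of what a one-line parallel map can look like, here is a minimal sketch built on the standard library's concurrent.futures module. The names square and parallel_squares are made up for this example, and a thread pool is assumed, which mainly helps I/O-bound work because of the GIL.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """Stand-in for any function applied independently to each item."""
    return n * n


def parallel_squares(numbers, workers=4):
    """Apply square() to every item using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The "one line": map() distributes the calls across the pool
        # and yields the results in input order.
        return list(pool.map(square, numbers))


if __name__ == "__main__":
    print(parallel_squares(range(10)))
```

For CPU-bound functions, swapping in ProcessPoolExecutor from the same module is the usual alternative, at the cost of pickling the arguments and results.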

Read more

Python Implementation of Classic Sorting Algorithms (1)

In computer science, a sorting algorithm is an algorithm that arranges a list of data in a specific order. The most commonly used sorting methods are numerical order and lexicographical (dictionary) order. Efficient sorting algorithms are crucial in various other algorithms. Sorting algorithms are also used in processing text data and generating human-readable output.

Basically, the output of a sorting algorithm must adhere to the following two principles:

  1. The output is in non-decreasing order (each element is no smaller than the one before it under the desired sort order).
  2. The output result is a permutation or rearrangement of the original input.
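These two principles can be checked mechanically. The helper below is a small sketch (the name is_valid_sort_output is made up for illustration) and assumes the elements are hashable and mutually comparable.

```python
from collections import Counter


def is_valid_sort_output(original, result):
    """Check both principles for an ascending sort."""
    # Principle 1: every element is no smaller than the one before it.
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    # Principle 2: the result holds exactly the same elements, with the
    # same multiplicities, as the input (i.e. it is a permutation).
    same_elements = Counter(original) == Counter(result)
    return in_order and same_elements


if __name__ == "__main__":
    data = [3, 1, 2, 3]
    print(is_valid_sort_output(data, sorted(data)))   # valid output
    print(is_valid_sort_output(data, [1, 2, 3]))      # an element was lost
```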

The 10 classic sorting algorithms can be divided into two main categories:

Non-linear time comparison-based sorting: These algorithms determine the relative order of elements by comparing them. Because the worst-case running time of any comparison-based sort cannot be better than $O(n \log n)$, they are called non-linear time comparison-based sorting algorithms.

Linear time non-comparison-based sorting: These algorithms do not determine the relative order of elements by comparison. They can break through the lower bound of comparison-based sorting and run in linear time, hence they are called linear time non-comparison-based sorting algorithms.
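To make the two categories concrete, here is one representative of each, written as a minimal sketch rather than the article's own implementation: merge sort decides the order purely through comparisons, while counting sort never compares two elements and instead assumes the input consists of integers in a reasonably small range.

```python
def merge_sort(items):
    """Comparison-based sort, O(n log n): order comes from `<=` tests."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the remainder of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def counting_sort(items):
    """Non-comparison sort, O(n + k) for integers spanning a range of k."""
    if not items:
        return []
    lo, hi = min(items), max(items)
    counts = [0] * (hi - lo + 1)
    for x in items:
        counts[x - lo] += 1          # tally each value by its offset
    result = []
    for offset, count in enumerate(counts):
        result.extend([lo + offset] * count)
    return result


if __name__ == "__main__":
    data = [5, 2, 9, 2, 7, 5, 1]
    print(merge_sort(data))
    print(counting_sort(data))
```

Counting sort only pays off when the value range k is not much larger than the number of elements, which is the price of escaping the comparison lower bound.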

……

Read more

Detailed Examples of Seaborn Plotting Kernel Density Curves

In a frequency histogram, as the sample size grows without bound and the bin width shrinks toward zero, the stepped outline of the histogram approaches a smooth curve. This curve is called the density curve of the population.

In this article, Chunjing Muke will detail how to use the Python plotting library Seaborn, together with the Iris dataset loaded through Pandas, to plot various cool density curves.


1. Basic Density Curve

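    import pandas as pd
    import seaborn as sns

    sns.set(color_codes=True)
    sns.set_style("white")
    # Load the Iris dataset from a local CSV file
    df = pd.read_csv('iris.csv')
    # Plot the kernel density curve of a single variable
    sns.kdeplot(df['sepal_width'])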


To plot a kernel density curve using Seaborn, you only need to use kdeplot. Note that a density curve only requires one variable; here we choose the sepal_width column.
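For intuition about what a kernel density curve is, the sketch below computes a Gaussian kernel density estimate by hand with NumPy. It is illustrative only and not Seaborn's implementation: the function name gaussian_kde_1d, the fixed bandwidth, and the sample values are all assumptions made for this example.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on the points in `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardised distance from every grid point to every sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # One standard normal bump per sample.
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    # Average the bumps and rescale so the curve integrates to 1.
    return kernels.sum(axis=1) / (len(samples) * bandwidth)


if __name__ == "__main__":
    widths = [3.5, 3.0, 3.2, 3.1, 3.6, 2.9]   # made-up values, not the Iris data
    grid = np.linspace(2.0, 4.5, 6)
    print(gaussian_kde_1d(widths, grid, bandwidth=0.2))
```

A smaller bandwidth gives a bumpier curve that follows individual samples; a larger one gives a smoother curve.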


2. Density Curve with Shading

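    import pandas as pd
    import seaborn as sns

    sns.set(color_codes=True)
    sns.set_style("white")
    df = pd.read_csv('iris.csv')
    # fill=True shades the area under the curve (seaborn >= 0.11;
    # older versions use shade=True instead)
    sns.kdeplot(df['sepal_width'], fill=True)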


……

Read more

Drawing a Stunning "Dream of the Red Chamber" Word Cloud with Python 3

Word clouds, which I’m sure you’ve all seen, are created using wordcloud, a famous Python library. This article will detail how to use wordcloud to create a word cloud for “Dream of the Red Chamber,” one of China’s Four Great Classical Novels.


1. Preparation

This involves three parts:

  1. A .txt text file of "Dream of the Red Chamber".
  2. The wordcloud and jieba libraries, which can be installed with pip install wordcloud and pip install jieba.
  3. A Chinese font file.

The .txt text file and font file are bundled together for your convenience to replicate this tutorial’s example.


2. Drawing the “Dream of the Red Chamber” Word Cloud

Here’s the code directly:

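    import jieba
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    # Segment the novel with jieba and join the words with spaces, since
    # WordCloud splits its input text on word boundaries.
    # The file is assumed to be UTF-8 encoded.
    with open("红楼梦.txt", encoding="utf-8") as f:
        text = " ".join(jieba.cut(f.read()))
    # A font containing Chinese glyphs is needed to render the words
    wordcloud = WordCloud(font_path="kaibold.ttf").generate(text)

    # Display the generated image:
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis("off")
    plt.margins(x=0, y=0)
    plt.show()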


……

Read more

TypeError: ufunc 'isnan' not supported for the input types - Solution

Today, while using Python’s Seaborn to plot a heatmap (clustermap), I kept encountering this error. My data seemed perfectly fine, and a Google search didn’t yield any good solutions. After some exploration, I’m sharing the final solution here.


1. Generating the DataFrame

    import pandas as pd
    import seaborn as sns

    sns.set(color_codes=True)
    # The first row holds strings, so after .T the first column is text
    df = pd.DataFrame([["a","b","c","d","e","f"],[1,2,3,4,5,6],[2,3,4,5,6,7],[3,4,5,6,7,8]], columns=list('ABCDEF')).T
    # Skip the string column; this call is the one that raises the TypeError
    g = sns.clustermap(df.iloc[:,1:], cmap="PiYG")

After generating and transposing the DataFrame, a TypeError occurs: TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule "safe".



2. Cause of the Error

This type of error arises because the DataFrame has been transposed, and the original DataFrame contained string columns. Just like in the example above, the first column contains strings (values ‘abcdef’). When transposed, all numerical values in the DataFrame are also converted to object types instead of float or int numerical types. Therefore, trying to plot a heatmap with character types naturally leads to an error.
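The dtype problem is easy to confirm without plotting anything. The short check below rebuilds the same DataFrame and prints the dtypes of the columns that were passed to clustermap; the variable names rows and numeric_part are chosen for this sketch.

```python
import pandas as pd

rows = [
    ["a", "b", "c", "d", "e", "f"],
    [1, 2, 3, 4, 5, 6],
    [2, 3, 4, 5, 6, 7],
    [3, 4, 5, 6, 7, 8],
]
df = pd.DataFrame(rows, columns=list("ABCDEF")).T

# Drop the string column, exactly as in the clustermap call.
numeric_part = df.iloc[:, 1:]

# The values look numeric, but every column is stored as "object".
print(numeric_part.dtypes)
```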

……

Read more

TypeError: ufunc 'isnan' not supported for the input types - Solution

After generating and transposing the DataFrame, a TypeError occurred: TypeError: ufunc ‘isnan’ not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule “safe”.


2. Cause of the Error

This type of error occurs because the DataFrame has been transposed, and the original DataFrame contains a column with strings. Just like in the example above, the first column contains string values “abcdef”. After transposition, all numbers in the DataFrame also become “object” type instead of “float” or “int” numeric types. Therefore, when we try to plot a heatmap with character types, an error naturally occurs.

If the DataFrame originally contained only numeric types, there would be no issue here.

3. Solution

Knowing the cause, the solution is simple: convert the corresponding numeric columns in the transposed DataFrame to numeric types. Here’s the code:

……
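The article's own code is cut off in this excerpt. One possible way to do the conversion described above, offered as a sketch rather than the author's exact fix, is to apply pd.to_numeric to the numeric columns of the transposed DataFrame. The variable names rows and numeric are assumptions for this example.

```python
import pandas as pd

rows = [
    ["a", "b", "c", "d", "e", "f"],
    [1, 2, 3, 4, 5, 6],
    [2, 3, 4, 5, 6, 7],
    [3, 4, 5, 6, 7, 8],
]
df = pd.DataFrame(rows, columns=list("ABCDEF")).T

# Convert each object-typed column that holds numbers to a numeric dtype.
numeric = df.iloc[:, 1:].apply(pd.to_numeric)
print(numeric.dtypes)

# `numeric` can now be passed to sns.clustermap(numeric, cmap="PiYG").
```

Calling .astype(float) on the same columns is an equivalent alternative when every value is known to be a number.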

Read more

Python Implementation for Kugou Music MP3 Download

After the earlier post on downloading MP3s from Qianqian Music with Python, some readers found that many songs could not be found on Qianqian Music. So today, Chunjian Muke extends the download functionality to Kugou Music, with source code provided.

Using the same approach, first search for a song directly on the Kugou official website. Then, open the network monitor in Google Chrome and search for the same keyword again. You’ll then be able to find the API information (Note: It’s best to view the network requests during the second search to filter out unnecessary information).


1. Analyzing Search API Information

With only 4 network requests, it is easy to see that the first one is the request that actually returns the song information, so this is the request to reconstruct.

……

Read more

Drawing NetworkX Network Graphs in Python 3

NetworkX is a Python library for studying graphs and networks. NetworkX is free software released under the BSD-new license. It can be used to create and manipulate complex networks, and to study the structure and function of complex networks.

With NetworkX, you can load or store networks in standard or non-standard data formats. It can generate many types of random or classic networks, analyze network structure, build network models, design new network algorithms, and draw networks.

Of course, NetworkX is not all that powerful on its own. Here, Chunjian Muke combines it with other widely used Python libraries to draw a variety of basic network graphs.


1. Drawing the Most Basic Network Graph

A network graph consists of nodes and edges. Here the edges are described by a pandas DataFrame in which each row holds the two endpoints of one connection: NetworkX creates an edge between the values in the 'from' and 'to' columns of the same row.
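The row-per-edge idea can be sketched with nx.from_pandas_edgelist, which builds a graph from two DataFrame columns. The node labels below are made up for illustration, and the drawing call is left as a comment so the snippet runs without a display.

```python
import networkx as nx
import pandas as pd

# Each row is one connection: an edge from the 'from' node to the 'to' node.
edges = pd.DataFrame({
    "from": ["A", "B", "C", "A"],
    "to":   ["D", "A", "E", "C"],
})

# Build an undirected graph with one edge per row.
G = nx.from_pandas_edgelist(edges, source="from", target="to")

print(sorted(G.nodes()))
print(G.number_of_edges())

# To render it with matplotlib:
# nx.draw(G, with_labels=True)
```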

……

Read more