The Most Detailed NumPy Tutorial (NumPy Basics Every Data Science Beginner Should Know)

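This article introduces some NumPy basics suited to data science beginners.

NumPy (Numerical Python) is Python's core library for numerical computing, built around an n-dimensional array type and including linear algebra routines. It matters to nearly every data science or machine learning package in Python: SciPy (Scientific Python), Matplotlib (a plotting library), Scikit-learn, and others all depend on NumPy to some degree.

NumPy is very useful for performing mathematical and logical operations on arrays. It provides a large set of useful features for working with n-dimensional arrays and matrices in Python.

This tutorial covers the NumPy basics that data science beginners need to know, including how to create NumPy arrays, how to use broadcasting, how to access values, and how to manipulate arrays. More importantly, it shows the advantages NumPy has over Python lists: it is more compact, faster at reading and writing items, more convenient, and more efficient.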
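As a rough illustration of the "more compact, more convenient" point, the sketch below doubles every value first with a plain Python list and then with a NumPy array. The variable names and the sample values are made up for this example.

```python
import numpy as np

# A plain Python list needs an explicit loop (or comprehension) for element-wise math.
# Note that `prices * 2` on a list would repeat the list instead of doubling the values.
prices = [1.5, 2.0, 3.25, 4.0]
doubled_list = [p * 2 for p in prices]

# The same operation on a NumPy array is a single vectorized expression.
prices_arr = np.array(prices)
doubled_arr = prices_arr * 2

# Every element of an array shares one dtype, which keeps storage compact.
print(doubled_list)
print(doubled_arr, doubled_arr.dtype)
```

The array version has no Python-level loop, which is also where most of NumPy's speed advantage comes from on large inputs.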

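This tutorial uses Jupyter Notebook as the editor.

Let's get started!

Installing NumPy

If you already have Anaconda installed, you can install NumPy from a terminal or command prompt with:

conda install numpy

If you don't have Anaconda, you can install NumPy from a terminal with:

pip install numpy

Once NumPy is installed, you can launch Jupyter Notebook and start learning. We begin with NumPy arrays.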
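Before moving on to random arrays, here is a minimal sketch of a few common ways to create an array. It is not taken from the original walkthrough; the variable names are illustrative, and np.zeros is the only function here that the tutorial does not use elsewhere.

```python
import numpy as np

from_list = np.array([1, 2, 3])                   # 1-D array built from a list
from_nested = np.array([[1, 2, 3], [4, 5, 6]])    # 2-D array built from nested lists
counting = np.arange(0, 10, 2)                    # evenly stepped values: 0, 2, 4, 6, 8
zeros = np.zeros((2, 3))                          # 2x3 array filled with 0.0

print(from_list)
print(from_nested)
print(counting)
print(zeros)
```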

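Using random.rand(), we can generate an array of random numbers drawn uniformly from the interval [0, 1).

For example, to get a one-dimensional array of 4 values drawn uniformly from [0, 1):

import numpy as np
my_rand = np.random.rand(4)

For a two-dimensional array with 5 rows and 4 columns:

my_rand = np.random.rand(5, 4)
my_rand

Using randn(), we can draw random samples from the standard normal (Gaussian) distribution centered at 0. For example, this generates 7 random numbers:

my_randn = np.random.randn(7)
my_randn

Plotted as a histogram, a large enough number of such samples traces out the familiar bell curve of the normal distribution.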
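Seven samples are too few to show the shape of the distribution, so the sketch below draws many more and checks the result numerically instead of plotting it. The seed, the sample count, and the bin layout are arbitrary choices made for this illustration.

```python
import numpy as np

np.random.seed(0)                      # arbitrary seed so the run is repeatable
samples = np.random.randn(100_000)     # far more than 7, so the shape becomes visible

print(samples.mean())                  # close to 0
print(samples.std())                   # close to 1

# Count samples in 9 unit-wide bins spanning -4.5 to 4.5.
counts, edges = np.histogram(samples, bins=9, range=(-4.5, 4.5))
print(counts)                          # the largest count falls in the middle bin
```

The counts rise toward the middle bin and fall away on both sides, which is the bell shape a histogram plot would show.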

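Similarly, to create a two-dimensional array with 3 rows and 5 columns:

np.random.randn(3, 5)

Finally, we can use randint() to generate arrays of integers. It is typically called with up to three arguments: the lowest value (inclusive), the highest value (exclusive), and the size of the array.

np.random.randint(20) # generates a random integer from 0 up to, but excluding, 20
np.random.randint(2, 20) # generates a random integer including 2 but excluding 20
np.random.randint(2, 20, 7) # generates 7 random integers including 2 but excluding 20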
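The functions above belong to NumPy's older random interface. As an aside that goes beyond the original walkthrough, recent NumPy versions also offer a Generator object created with np.random.default_rng(). The sketch below shows what appear to be the closest equivalents; the seed value is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)    # arbitrary seed for reproducible output

uniform = rng.random(4)                 # like np.random.rand(4): floats in [0, 1)
normal = rng.standard_normal((3, 5))    # like np.random.randn(3, 5)
ints = rng.integers(2, 20, size=7)      # like np.random.randint(2, 20, 7): 2 <= x < 20

print(uniform)
print(normal.shape)
print(ints)
```

Note that the Generator methods take the shape as a single tuple or a size argument, rather than as separate positional dimensions.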

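Converting a one-dimensional array to a two-dimensional array

First create a one-dimensional array of 25 random numbers:

arr = np.random.rand(25)

Then use reshape() to convert it to a two-dimensional array:

arr.reshape(5, 5)

Note: reshape() requires that the product of the new dimensions equal the number of elements. The arr above holds 25 elements, so it can be reshaped to 5*5 (or 1*25, or 25*1), but not to, say, 4*6. reshape() also returns a new array object rather than changing arr in place, so assign the result to a name if you want to keep it.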
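The element-count rule is easiest to see with a small array. This sketch uses np.arange(12) instead of random values so the output is readable, and it also shows -1 as a placeholder that asks NumPy to infer one dimension.

```python
import numpy as np

arr = np.arange(12)            # 12 elements: 0 through 11

a = arr.reshape(3, 4)          # 3 * 4 == 12, so this works
b = arr.reshape(2, -1)         # -1 lets NumPy infer the missing dimension (6)

# reshape() returns a new array object; arr itself keeps its 1-D shape.
print(arr.shape, a.shape, b.shape)

try:
    arr.reshape(5, 5)          # 5 * 5 != 12, so NumPy refuses
except ValueError as err:
    print("reshape failed:", err)
```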

Finding the maximum and minimum values in a NumPy array

Using max() and min(), we can get the largest or smallest value in an array:

arr_2 = np.random.randint(0, 20, 10)
arr_2.max() # This gives the highest value in the array
arr_2.min() # This gives the lowest value in the array

Using argmax() and argmin(), we can find the indices of the maximum and minimum values:

arr_2.argmax() # This shows the index of the highest value in the array
arr_2.argmin() # This shows the index of the lowest value in the array
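On a two-dimensional array, argmax() with no arguments returns a position in the flattened array, which can be surprising. The sketch below, using a small made-up grid, converts that position back to a (row, column) pair with np.unravel_index and also shows the axis argument.

```python
import numpy as np

grid = np.array([[3, 9, 1],
                 [7, 2, 8]])

flat_index = grid.argmax()                            # position in the flattened array
row, col = np.unravel_index(flat_index, grid.shape)   # convert to (row, column)
per_column = grid.argmax(axis=0)                      # row index of the max in each column

print(flat_index)        # 1
print(row, col)          # 0 1
print(per_column)        # [1 0 1]
```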

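Suppose you are working with many arrays and need to know the shape of one, for instance whether it is one-dimensional or two-dimensional. Just use the shape attribute:

arr.shape # (25,) here, because the reshape() result above was not assigned back to arr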
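Alongside shape, the ndim and size attributes answer the "how many dimensions" and "how many elements" questions directly. A small sketch with illustrative names:

```python
import numpy as np

one_d = np.arange(6)
two_d = one_d.reshape(2, 3)

print(one_d.shape, one_d.ndim, one_d.size)   # (6,) 1 6
print(two_d.shape, two_d.ndim, two_d.size)   # (2, 3) 2 6
```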

Indexing and selecting elements from a NumPy array

Indexing a NumPy array works much like indexing a Python list. Just supply the index you want:

my_array = np.arange(0, 11)
my_array[8] # This gives us the value of the element at index 8

To get a range of values from an array, we can use the slice notation ":", just as in Python:

my_array[2:6] # This returns everything from index 2 to 6 (exclusive)
my_array[:6] # This returns everything from index 0 to 6 (exclusive)
my_array[5:] # This returns everything from index 5 to the end of the array

Similarly, we can select elements in a two-dimensional array using either [ ][ ] or [,].

Using [ ][ ] to grab the value 60 from the two-dimensional array below:

two_d_arr = np.array([[10,20,30], [40,50,60], [70,80,90]])
two_d_arr[1][2] # The value 60 is in row index 1 and column index 2

Using [,] to grab the value 20 from the array above:

two_d_arr[0, 1]

We can also use slices to grab a sub-section of a two-dimensional array:

two_d_arr[:1, :2] # This returns [[10, 20]]
two_d_arr[:2, 1:] # This returns [[20, 30], [50, 60]]
two_d_arr[:2, :2] # This returns [[10, 20], [40, 50]]

We can also index an entire row or column. To grab any row, just use its index:

two_d_arr[0] # This grabs row 0 of the array ([10, 20, 30])
two_d_arr[:2] # This grabs everything before row 2 ([[10, 20, 30], [40, 50, 60]])
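The examples above grab rows. A column can be selected in the same spirit by slicing all rows with ":" and giving the column index, as this short sketch on the same array shows.

```python
import numpy as np

two_d_arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

col_1 = two_d_arr[:, 1]         # every row, column index 1, as a 1-D array
col_1_2d = two_d_arr[:, 1:2]    # same values, kept as a 3x1 two-dimensional array

print(col_1)                    # [20 50 80]
print(col_1_2d.shape)           # (3, 1)
```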

We can also use the &, |, <, >, and == operators for conditional and logical selection, comparing the values in an array against a given value:

new_arr = np.arange(5, 15)
new_arr > 10 # This returns True where the elements are greater than 10: [False, False, False, False, False, False, True, True, True, True]

Now we can output the elements that satisfy this condition:

bool_arr = new_arr > 10
new_arr[bool_arr] # This returns elements greater than 10: [11, 12, 13, 14]
new_arr[new_arr > 10] # A shorter way to do what we have just done

Combining conditional and logical operators, we can get the elements greater than 6 and less than 10:

new_arr[(new_arr > 6) & (new_arr < 10)]

The expected result is: [7, 8, 9]
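The | and == operators work the same way as & in the example above. The sketch below uses the same array; the thresholds are arbitrary. The parentheses around each comparison matter, because & and | bind more tightly than < and > in Python.

```python
import numpy as np

new_arr = np.arange(5, 15)

outside = new_arr[(new_arr < 7) | (new_arr > 12)]   # less than 7 OR greater than 12
exactly_ten = new_arr[new_arr == 10]                # elements equal to 10

print(outside)        # [ 5  6 13 14]
print(exactly_ten)    # [10]
```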

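Broadcasting

Broadcasting is how NumPy stretches a scalar or a smaller array to match the shape of a larger one. One common use is quickly changing several values in a NumPy array at once:

my_array[0:3] = 50
# Result is: [50, 50, 50, 3, 4, 5, 6, 7, 8, 9, 10]

In this example, the elements at indices 0 through 2 (the slice 0:3 excludes index 3) are changed from their initial values to 50.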
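One consequence worth knowing when assigning to slices: a basic slice of a NumPy array is a view onto the same data, not a copy. The sketch below, with illustrative variable names, shows the difference and how copy() avoids touching the original.

```python
import numpy as np

my_array = np.arange(0, 11)
view = my_array[0:3]       # a basic slice is a view onto my_array's data
view[:] = 50               # the scalar is broadcast across the slice
print(my_array)            # the first three elements of my_array are now 50

fresh = np.arange(0, 11)
safe = fresh[0:3].copy()   # an explicit copy owns its own data
safe[:] = 99
print(fresh)               # fresh is unchanged
```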

Performing mathematical operations on NumPy arrays

arr = np.arange(1, 11)
arr * arr # Multiplies each element by itself
arr - arr # Subtracts each element from itself
arr + arr # Adds each element to itself
arr / arr # Divides each element by itself

We can also perform scalar operations on arrays; NumPy makes this possible through broadcasting:

arr + 50 # This adds 50 to every element in the array
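Broadcasting is not limited to scalars. As a small sketch with made-up values, adding a row of shape (3,) to a column of shape (2, 1) stretches both to a common (2, 3) shape.

```python
import numpy as np

row = np.array([1, 2, 3])        # shape (3,)
col = np.array([[10], [20]])     # shape (2, 1)

table = row + col                # both operands are stretched to shape (2, 3)

print(table.shape)               # (2, 3)
print(table)                     # [[11 12 13]
                                 #  [21 22 23]]
```

Shapes are compared from the last dimension backward; each pair of dimensions must either match or have one side equal to 1.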

NumPy also lets us apply universal functions such as square root, exponential, and trigonometric functions to arrays, along with aggregations such as sum and standard deviation.

np.sqrt(arr) # Returns the square root of each element
np.exp(arr) # Returns the exponential of each element
np.sin(arr) # Returns the sine of each element
np.cos(arr) # Returns the cosine of each element
np.log(arr) # Returns the natural logarithm of each element
np.sum(arr) # Returns the sum of all elements in the array
np.std(arr) # Returns the standard deviation of the array

We can also get the sums of the rows or columns of a two-dimensional array:

mat = np.arange(1, 26).reshape(5, 5)
mat.sum() # Returns the sum of all the values in mat
mat.sum(axis=0) # Returns the sum of each column in mat
mat.sum(axis=1) # Returns the sum of each row in mat
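To make the axis argument concrete, here are the values these calls produce for the same 5x5 matrix, plus mean() along an axis as one further illustration. axis=0 collapses the rows (one result per column), and axis=1 collapses the columns (one result per row).

```python
import numpy as np

mat = np.arange(1, 26).reshape(5, 5)   # rows are 1..5, 6..10, ..., 21..25

total = mat.sum()                      # 325
col_sums = mat.sum(axis=0)             # [55 60 65 70 75]
row_sums = mat.sum(axis=1)             # [ 15  40  65  90 115]
row_means = mat.mean(axis=1)           # [ 3.  8. 13. 18. 23.]

print(total)
print(col_sums)
print(row_sums)
print(row_means)
```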

That brings this NumPy tutorial to a close. Hopefully it has been helpful.