NumPy and Pandas are comprehensive, efficient, and flexible Python tools for data manipulation. An important concept for proficient users of these two libraries to understand is how data are referenced as shallow copies (views) and deep copies (or just copies). Pandas sometimes issues a SettingWithCopyWarning to warn the user of a potentially inappropriate use of views and copies.
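As a quick preview of the difference between a view and a copy, here is a minimal sketch using NumPy alone. The array values and the names arr, arr_view, and arr_copy are illustrative assumptions rather than part of the article's own examples:

```python
import numpy as np

# Illustrative data; any one-dimensional array would do.
arr = np.array([1, 2, 4, 8, 16, 32])

arr_view = arr[1:4]         # basic slicing returns a view that shares arr's data
arr_copy = arr[1:4].copy()  # .copy() allocates new, independent data

arr_view[0] = 100           # modifies arr as well, since the memory is shared
arr_copy[1] = -1            # leaves arr untouched

print(arr)
print(np.shares_memory(arr, arr_view))  # True: the view shares memory with arr
print(np.shares_memory(arr, arr_copy))  # False: the copy has its own memory
```

Writing through the view changes the original array, while writing to the copy does not. This is the behavior that SettingWithCopyWarning is meant to draw attention to in Pandas.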
In this article, you’ll learn:
- What views and copies are in NumPy and Pandas
- How to properly work with views and copies in NumPy and Pandas
- Why the SettingWithCopyWarning happens in Pandas
- How to avoid getting a SettingWithCopyWarning in Pandas
You’ll first see a short explanation of what the SettingWithCopyWarning is and how to avoid it. You might find this enough for your needs, but you can also dig a bit deeper into the details of NumPy and Pandas to learn more about copies and views.
Prerequisites
To follow the examples in this article, you’ll need Python 3.7 or 3.8, as well as the libraries NumPy and Pandas. This article is written for NumPy version 1.18.1 and Pandas version 1.0.3. You can install them with pip:
$ python -m pip install -U "numpy==1.18.*" "pandas==1.0.*"
If you prefer Anaconda or Miniconda distributions, you can use the conda package management system. To learn more about this approach, check out Setting Up Python for Machine Learning on Windows. For now, it’ll be enough to install NumPy and Pandas in your environment:
$ conda install "numpy=1.18.*" "pandas=1.0.*"
Now that you have NumPy and Pandas installed, you can import them and check their versions: