Python is a versatile programming language with strong support for working with many kinds of data. One of the most common file formats for storing tabular data is Microsoft Excel's XLSX format. In this article, we will show you how to read an XLSX file in Python using the openpyxl library, and briefly how to write one.
## What is openpyxl?
openpyxl is a third-party Python library for working with Microsoft Excel files. It reads and writes XLSX files and exposes workbooks, worksheets, and cells as Python objects. It is widely used, and pandas can use it as its engine for reading and writing XLSX files.
## Installing openpyxl
To use openpyxl, you need to install it first. You can do this by running the following command in your terminal or command prompt:
pip install openpyxl
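To confirm that the installation worked, a quick check is to import the package and print its version. This is only a sanity check, and the exact version string depends on your environment:

```python
import openpyxl

# If this import fails, the package is not installed in the active environment.
print(openpyxl.__version__)
```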
## Reading an XLSX file in Python
To read an XLSX file in Python using openpyxl, you need to follow the steps below:
- Import the `openpyxl` library.
- Load the XLSX file using the `load_workbook()` function.
- Get the worksheet you want to read data from.
- Loop through the rows and columns in the worksheet and print the values.
Here's the code that implements these steps:
import openpyxl

# Load the XLSX file
workbook = openpyxl.load_workbook("sample.xlsx")

# Get the worksheet
worksheet = workbook.active

# Loop through the rows and columns
for row in worksheet.iter_rows():
    for cell in row:
        print(cell.value, end=" ")
    print()
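The loop above expects a file named sample.xlsx in the working directory. As a minimal sketch that runs on its own, the following first creates a small workbook in a temporary folder (the path and the cell contents are made up for illustration) and then reads it back. It also passes `values_only=True` to `iter_rows()`, which yields plain tuples of values instead of cell objects:

```python
import os
import tempfile

import openpyxl

# Build a small demo file so the example is self-contained.
# The path and the cell contents are assumptions for illustration only.
path = os.path.join(tempfile.mkdtemp(), "sample.xlsx")
workbook = openpyxl.Workbook()
worksheet = workbook.active
worksheet.append(["Name", "Age"])
worksheet.append(["John", 30])
workbook.save(path)

# Load the file back and read every row.
workbook = openpyxl.load_workbook(path)
worksheet = workbook.active

# values_only=True yields tuples of values rather than Cell objects.
rows = list(worksheet.iter_rows(values_only=True))
for row in rows:
    print(*row)
```

Working with plain values is usually enough when you only need the data and not cell styles or coordinates.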
### Example 1: Reading an XLSX file with multiple worksheets
In some cases, you may have an XLSX file with multiple worksheets. To read data from a specific worksheet, index the workbook by sheet name, as in `workbook["Sheet1"]`. Older tutorials use the `get_sheet_by_name()` method, which is deprecated in favor of this syntax. Here's the code:
import openpyxl

# Load the XLSX file
workbook = openpyxl.load_workbook("sample.xlsx")

# Get the worksheet by name
worksheet = workbook["Sheet1"]

# Loop through the rows and columns
for row in worksheet.iter_rows():
    for cell in row:
        print(cell.value, end=" ")
    print()
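Sheet names do not have to be hard-coded. As a hedged sketch, the following builds an in-memory workbook with two sheets (the names "Sales" and "Costs" are invented for this example), lists the available names with `workbook.sheetnames`, and checks that a name exists before using it:

```python
import openpyxl

# An in-memory workbook with two sheets; the sheet names are made up.
workbook = openpyxl.Workbook()
workbook.active.title = "Sales"
workbook.create_sheet("Costs")
workbook["Sales"]["A1"] = "sales data"
workbook["Costs"]["A1"] = "cost data"

# sheetnames lists the sheet titles in workbook order.
print(workbook.sheetnames)

# Indexing with an unknown name raises KeyError, so check first.
name = "Costs"
if name in workbook.sheetnames:
    print(workbook[name]["A1"].value)

# Collect the first cell of every sheet, keyed by sheet title.
first_cells = {sheet.title: sheet["A1"].value for sheet in workbook.worksheets}
print(first_cells)
```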
### Example 2: Reading an XLSX file and storing the data in a list
If you want to store the data from the XLSX file in a list, you can modify the basic reading code so that each row becomes a list and all rows are collected in an outer list:
import openpyxl

# Load the XLSX file
workbook = openpyxl.load_workbook("sample.xlsx")

# Get the worksheet
worksheet = workbook.active

# Create an empty list
data = []

# Loop through the rows and columns
for row in worksheet.iter_rows():
    row_data = []
    for cell in row:
        row_data.append(cell.value)
    data.append(row_data)

# Print the data
print(data)
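When the first row of the sheet holds column headers, a list of dictionaries is often easier to work with than a list of lists. The sketch below assumes such a header row and uses invented sample data in an in-memory workbook:

```python
import openpyxl

# In-memory sample data; the header names and values are made up.
workbook = openpyxl.Workbook()
worksheet = workbook.active
worksheet.append(["Name", "Age"])
worksheet.append(["John", 30])
worksheet.append(["Maria", 41])

# Treat the first row as the header and pair it with each remaining row.
rows = worksheet.iter_rows(values_only=True)
header = next(rows)
records = [dict(zip(header, row)) for row in rows]

print(records)
```

Each record can then be accessed by column name, for example `records[0]["Name"]`.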
## Writing to an XLSX file in Python using openpyxl
In addition to reading XLSX files, openpyxl also provides functionality for writing to XLSX files. The steps for writing to an XLSX file are similar to those for reading. Here's the code:
import openpyxl

# Create a new workbook
workbook = openpyxl.Workbook()

# Get the active worksheet
worksheet = workbook.active

# Write some data to the worksheet
worksheet["A1"] = "Name"
worksheet["B1"] = "Age"
worksheet["A2"] = "John"
worksheet["B2"] = 30

# Save the workbook
workbook.save("sample_write.xlsx")
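Assigning cells one at a time becomes tedious for larger tables. As a rough sketch, `worksheet.append()` adds a whole row after the last used row. The sample rows, the sheet title "People", and the temporary path below are assumptions for illustration:

```python
import os
import tempfile

import openpyxl

# Made-up rows to write, header first.
rows = [("Name", "Age"), ("John", 30), ("Maria", 41)]

workbook = openpyxl.Workbook()
worksheet = workbook.active
worksheet.title = "People"

# append() writes each sequence as the next row of the sheet.
for row in rows:
    worksheet.append(row)

# Save to a temporary folder so the example leaves the working directory alone.
path = os.path.join(tempfile.mkdtemp(), "sample_write.xlsx")
workbook.save(path)

# Load the file again to confirm what was written.
reloaded = openpyxl.load_workbook(path)
saved_rows = list(reloaded["People"].iter_rows(values_only=True))
print(saved_rows)
```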
## Other Features of openpyxl
openpyxl provides many more features for working with XLSX files. Here are some of the most commonly used ones:
1. Formatting cells: You can format cells in an XLSX file using openpyxl. This includes setting font style, background color, etc.
2. Merging cells: You can merge cells in an XLSX file using openpyxl. This is useful when you want to display data in a specific way.
3. Formulas: You can write formulas to cells using openpyxl, for example `=SUM(B2:B10)` to total a column. openpyxl stores the formula but does not evaluate it; the result is calculated when the file is opened in a spreadsheet application such as Excel.
4. Adding charts: You can add charts to an XLSX file using openpyxl. This is useful for visualizing data.
For more information on these and other features, you can refer to the official openpyxl documentation.
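To give a feel for a few of these features, here is a small sketch that merges cells, applies a bold font and a background fill, and writes a formula. The sheet contents, colors, and file path are made up for illustration, and the snippet covers only a fraction of the styling options:

```python
import os
import tempfile

import openpyxl
from openpyxl.styles import Font, PatternFill

workbook = openpyxl.Workbook()
worksheet = workbook.active

# Merge A1:B1 into one title cell and make the text bold.
worksheet.merge_cells("A1:B1")
worksheet["A1"] = "Quarterly totals"
worksheet["A1"].font = Font(bold=True)

# Header row with a solid yellow background.
yellow = PatternFill(fill_type="solid", start_color="FFFF00", end_color="FFFF00")
worksheet["A2"] = "Item"
worksheet["B2"] = "Amount"
worksheet["A2"].fill = yellow
worksheet["B2"].fill = yellow

# Data rows plus a formula; openpyxl stores the formula text as-is.
worksheet["A3"] = "Widgets"
worksheet["B3"] = 10
worksheet["A4"] = "Gadgets"
worksheet["B4"] = 15
worksheet["B5"] = "=SUM(B3:B4)"

path = os.path.join(tempfile.mkdtemp(), "features.xlsx")
workbook.save(path)

# By default, loading returns the formula text.
formulas = openpyxl.load_workbook(path)
print(formulas.active["B5"].value)

# data_only=True returns the cached result instead. It is None here because
# no spreadsheet application has opened and recalculated the file yet.
cached = openpyxl.load_workbook(path, data_only=True)
print(cached.active["B5"].value)
```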
## Conclusion
In this article, we showed you how to read and write XLSX files in Python using the openpyxl library. We demonstrated how to read data from XLSX files with multiple worksheets, and how to store the data in a list. We also discussed some of the other features of openpyxl, including formatting cells, merging cells, writing formulas, and adding charts.
## Popular questions
1. What is the library used to read XLSX files in Python?
- The library used to read XLSX files in Python is openpyxl.
2. How do you install the openpyxl library in Python?
- You can install the openpyxl library in Python using the pip package manager by running the following command in your terminal or command prompt: `pip install openpyxl`.
3. How do you read data from an XLSX file in Python using openpyxl?
- To read data from an XLSX file in Python using openpyxl, you need to first load the workbook using the `openpyxl.load_workbook` function, and then access the worksheet you want to read from. You can then use the cell indexing method to read the data in a particular cell, for example `worksheet['A1'].value`.
4. How do you read data from multiple worksheets in an XLSX file in Python using openpyxl?
- To read data from multiple worksheets in an XLSX file in Python using openpyxl, you need to first load the workbook using the `openpyxl.load_workbook` function, and then access each worksheet you want to read from using the `workbook.worksheets` property. You can then use the cell indexing method to read the data in a particular cell, for example `worksheet['A1'].value`.
5. What are some of the other features provided by the openpyxl library for working with XLSX files in Python?
- Some of the other features provided by the openpyxl library for working with XLSX files in Python include formatting cells, merging cells, writing formulas, and adding charts.
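The cell access mentioned in the answers above can be sketched as follows, using an in-memory workbook with made-up values. It shows access by coordinate, by row and column number, and by range:

```python
import openpyxl

# In-memory sample sheet; the values are invented for illustration.
workbook = openpyxl.Workbook()
worksheet = workbook.active
worksheet["A1"] = "Name"
worksheet["B1"] = "Age"
worksheet["A2"] = "John"
worksheet["B2"] = 30

# Access a single cell by its coordinate.
by_coordinate = worksheet["A2"].value

# Access the same kind of cell by 1-based row and column numbers.
by_index = worksheet.cell(row=2, column=2).value

# A range returns rows of cells, which can be reduced to plain values.
block = [[cell.value for cell in row] for row in worksheet["A1:B2"]]

print(by_coordinate, by_index)
print(block)
```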