Pandas is one of the most popular data analysis libraries in Python. It is primarily used to manipulate and analyze data in a tabular form called a DataFrame. One of the challenges data analysts face is working with float data types. Sometimes, it is convenient to convert the float data to integer types. But how do you do that with Pandas? In this article, we explore different ways of converting Pandas float data to integer types.
Let's start the tutorial by creating a simple DataFrame.
import pandas as pd
data = {'name': ['Alice', 'Bob', 'Charlie'],
        'Age': [23.0, 27.5, 29.3],
        'Salary': [15000.50, 23000.00, 27000.50]}
df = pd.DataFrame(data)
Here, we have three columns: name, Age, and Salary. Pandas stores the Age and Salary columns as float64.
Option 1: Use the .astype() method
The easiest way to convert float data to integer data types is by using the .astype() method. This method converts a series to a specified data type, in this case, int data type.
df['Age'] = df['Age'].astype(int)
df['Salary'] = df['Salary'].astype(int)
As shown above, we call .astype() on each column and pass int as the argument.
The .astype() method truncates: it discards the fractional part, which rounds toward zero rather than down. So the Age values 23.0, 27.5, and 29.3 become 23, 27, and 29, and the Salary values become 15000, 23000, and 27000. For negative numbers this differs from rounding down: -27.5 becomes -27, not -28. Note that this method raises an exception if the column contains values that have no integer representation, such as NaN or infinity.
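The truncation behavior and the failure on missing values can be checked with a small sketch. The sample values below are made up for illustration and are not part of the DataFrame above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values chosen to show the edge cases.
s = pd.Series([27.5, -27.5, 29.9, -0.9])

# astype(int) drops the fractional part, i.e. it truncates toward zero.
truncated = s.astype(int)
print(truncated.tolist())  # [27, -27, 29, 0]

# A missing value cannot be stored in a NumPy integer column.
with_nan = pd.Series([1.0, np.nan, 3.0])
try:
    with_nan.astype(int)
    raised = False
except ValueError as exc:  # pandas raises ValueError (or a subclass of it)
    raised = True
    print(type(exc).__name__, exc)
```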
Option 2: Use the round() function
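Another way to convert float data to integers is the built-in round() function. Called with a single argument, round() returns the nearest integer as a Python int. Values exactly halfway between two integers are rounded to the even one, so round(27.5) is 28 but round(2.5) is 2. We can apply this function to each element in the series using the .apply() method. If you are running the options in sequence, re-create the DataFrame before each one, because Option 1 has already turned Age and Salary into integers.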
df['Age'] = df['Age'].apply(lambda x: round(x))
df['Salary'] = df['Salary'].apply(lambda x: round(x))
We use the .apply() method to apply the lambda function to each element in the series. The lambda function takes an argument x and passes it to round(), which returns the nearest integer. Unlike .astype(int), this rounds to the nearest integer instead of truncating. It does not handle missing values any better, though: round() raises a ValueError for NaN and an OverflowError for infinity. It is also slower than .astype(), because .apply() calls a Python function once per element.
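A short sketch of how the built-in round() treats exact halves, using made-up values. The half_up line is one possible alternative for readers who expect halves to always round up; it is an assumption about what you might want, not something round() does, and for negative halves it rounds toward positive infinity.

```python
import numpy as np
import pandas as pd

# Hypothetical values that sit exactly halfway between two integers.
halves = pd.Series([0.5, 1.5, 2.5, 27.5])

# Built-in round() sends exact halves to the nearest even integer.
rounded = halves.apply(lambda x: round(x))
print(rounded.tolist())  # [0, 2, 2, 28]

# Alternative sketch: add 0.5 and floor, so that halves always round up.
half_up = np.floor(halves + 0.5).astype(int)
print(half_up.tolist())  # [1, 2, 3, 28]
```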
Option 3: Use the floor() function
If you want to convert float data to integers by rounding down, you can use the math.floor() function. The math.floor() function takes a float argument and rounds it down to the nearest integer. Just like in the previous method, we use the .apply() method to apply the math.floor() function to each element in the series.
import math
df['Age'] = df['Age'].apply(lambda x: math.floor(x))
df['Salary'] = df['Salary'].apply(lambda x: math.floor(x))
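Here, we use the same approach as in the previous method but use the math.floor() function instead of the round() function. This method is useful when you want to round down the values to the nearest integer. For non-negative values the result matches .astype(int). For negative values it does not, because math.floor(-2.7) is -3 while truncation gives -2.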
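The difference between flooring and truncating only shows up for negative numbers, as this sketch with made-up values illustrates. It also includes a vectorized variant using np.floor, which avoids the per-element Python call and is offered as an alternative, not as part of the method described above.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical values: one positive, one negative.
s = pd.Series([2.7, -2.7])

floored = s.apply(lambda x: math.floor(x))  # rounds toward negative infinity
truncated = s.astype(int)                   # rounds toward zero
vectorized = np.floor(s).astype(int)        # same result as floored, no apply

print(floored.tolist())     # [2, -3]
print(truncated.tolist())   # [2, -2]
print(vectorized.tolist())  # [2, -3]
```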
Option 4: Use the .round() method
Another option to convert float data to integer types is to use the .round() method. The .round() method rounds the values to a specified number of decimal places and returns floats, which you can then convert to integers with the .astype() method.
df['Age'] = df['Age'].round(decimals=0).astype(int)
df['Salary'] = df['Salary'].round(decimals=0).astype(int)
Here, we first round the float values to zero decimal places, and then convert them to integer types using the .astype() method. Both steps are vectorized, so this is usually faster than the .apply() approaches and is a good default when you want nearest-integer rounding. Like the built-in round(), .round() sends exact halves to the nearest even integer.
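As a sketch, both columns can also be handled in one call at the DataFrame level by passing dictionaries to .round() and .astype(). The DataFrame is rebuilt here so the example runs on its own; the name converted is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'Age': [23.0, 27.5, 29.3],
    'Salary': [15000.50, 23000.00, 27000.50],
})

# Round the listed columns to zero decimals, then cast only those columns.
converted = df.round({'Age': 0, 'Salary': 0}).astype({'Age': int, 'Salary': int})

print(converted['Age'].tolist())     # [23, 28, 29]
print(converted['Salary'].tolist())  # [15000, 23000, 27000]
```

The two salaries ending in .50 round down here because 15000 and 27000 are the even neighbors.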
Conclusion
Converting float data types to integer data types is a common step in data preprocessing. In this article, we discussed four different ways to convert Pandas float data to integer types. We covered the .astype() method, round() function, math.floor() function, and the .round() method. They differ mainly in how they treat the fractional part (truncate, round to nearest, or round down) and in speed, so the right choice depends on the specific use case.
Let's dive deeper into the methods above and explore some additional details and use cases.
Using the .astype() method
The .astype() method is the most common and straightforward way to convert float data types to integer data types in Pandas. However, this method raises a ValueError if the column contains NaN or infinite values, because NumPy integer types cannot represent them. Fractional values do not raise; they are silently truncated. Values too large for the target integer type are not reliably rejected either, and the result of such a conversion is not meaningful, so check the range beforehand if that can occur. Handle or clean missing values before converting to prevent an unexpected termination of your program.
On the other hand, the .astype() method is helpful when you want to explicitly convert the data type of the column or series, and you know the values in the column or series can be safely converted to integers. For example, if you have a column 'Year' with float data types such as 2020.0, 2019.0, and 2018.0, you can use the .astype(int) method to convert them to integer data types.
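If the column may contain missing values, one option is the pandas nullable integer dtype 'Int64' (capital I), which keeps the gaps as <NA>. The sketch below uses a made-up 'Year'-style series; dropping the missing rows first is shown as a second option. Note that casting to 'Int64' expects whole-number floats, so round first if the values have fractional parts.

```python
import numpy as np
import pandas as pd

# Hypothetical 'Year' values with one missing entry.
years = pd.Series([2020.0, np.nan, 2018.0])

# Nullable integer dtype: missing values become <NA> instead of raising.
nullable = years.astype('Int64')
print(nullable.tolist())

# Alternative: remove missing rows, then use the plain NumPy integer type.
dropped = years.dropna().astype(int)
print(dropped.tolist())  # [2020, 2018]
```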
Using the round() function
The round() function is a built-in Python function. Called with one argument, it returns an int. Called with a second argument giving the number of decimal places, it returns a float, even when that argument is 0. To use it on a column, we apply it to each value in the series with the .apply() method. The two-argument form is useful when you want to round off the float values and keep them as floating-point numbers.
For example, if you have a column 'Temperature' with decimal values such as 25.34, 24.76, and 25.00, you can use the round() function to round the values to zero decimal places and keep them as float data types.
df['Temperature'] = df['Temperature'].apply(lambda x: round(x, 0))
This code uses the .apply() method to apply the lambda function to each element in the 'Temperature' column. The lambda function takes an argument 'x' and applies the round() function with the second argument set to zero, which rounds off the floating-point number to zero decimal places.
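The effect of the second argument on the resulting dtype can be seen in a small sketch that uses the 'Temperature' values from the example as a standalone series.

```python
import pandas as pd

temps = pd.Series([25.34, 24.76, 25.00])

# Two-argument form: values are rounded but stay floats.
kept_float = temps.apply(lambda x: round(x, 0))
print(kept_float.tolist(), kept_float.dtype)  # [25.0, 25.0, 25.0] float64

# One-argument form: each value becomes a Python int.
as_int = temps.apply(lambda x: round(x))
print(as_int.tolist())  # [25, 25, 25]
```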
Using the floor() function
The math.floor() function, as discussed before, rounds a floating-point number down (toward negative infinity) to the nearest integer and returns an integer value. For non-negative values this is the same as discarding the decimal part. For negative values it moves one step further from zero, so -19.99 becomes -20.
For example, if you have a column 'Price' with float data types such as 23.50, 19.99, and 29.95, you can use the math.floor() function to convert the values to integer data types.
df['Price'] = df['Price'].apply(lambda x: math.floor(x))
This code uses the .apply() method to apply the lambda function to each element in the 'Price' column. The lambda function takes an argument 'x' and applies the math.floor() function to it, which rounds down the float value to the nearest integer.
Using the .round() method
The .round() method, as mentioned before, rounds the float values to a specified number of decimal places and returns floats. To get integers, follow it with .astype(int). Keep in mind that .astype(int) truncates whatever .round() returns, so only rounding to zero decimal places gives nearest-integer results.
For example, if you have a column 'Discount' holding fractions such as 0.25, 0.10, and 0.05, calling .round(2).astype(int) would turn every value into 0. To store the discounts as whole percentages instead, scale them first and then round to zero decimal places.
df['Discount'] = (df['Discount'] * 100).round().astype(int)
This code multiplies each value in the 'Discount' column by 100, rounds the result to the nearest integer, and then uses the .astype() method to convert the float values to integer data types, giving 25, 10, and 5.
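A quick check of why fractions below 1 need scaling before an integer conversion, using the 'Discount' values as a standalone series. The names naive and percent are only illustrative.

```python
import pandas as pd

discount = pd.Series([0.25, 0.10, 0.05])

# Rounding to two decimals and then casting truncates everything to zero.
naive = discount.round(2).astype(int)
print(naive.tolist())  # [0, 0, 0]

# Scaling to whole percentages first keeps the information.
percent = (discount * 100).round().astype(int)
print(percent.tolist())  # [25, 10, 5]
```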
Conclusion
Converting float data types to integer data types in Pandas can be done using several methods such as the .astype() method, round() function, math.floor() function, and .round() method. Each method has its use cases, and choosing the right method depends on your specific data manipulation requirements. Whichever you choose, deal with missing values before converting and check that dropping the fractional part is acceptable for your data.
Popular questions
What is the most common method for converting float to int in Pandas?
Answer: The .astype() method is the most common method for converting float to int in Pandas. 
What is the purpose of the round() function when converting float values to integers?
Answer: Called with one argument, round() returns the nearest integer (exact halves go to the even neighbor), so the values are rounded rather than truncated.
What exception will be thrown if a float value cannot be converted to an integer data type using the .astype() method?
Answer: The .astype(int) call raises a ValueError when the column contains NaN or infinite values. Fractional values do not raise; they are truncated.
When should you use the math.floor() function to convert float values to integer data types?
Answer: You should use the math.floor() function when you want to always round down, toward negative infinity. For non-negative values this is the same as discarding the decimal part.
How can you round float values to the nearest integer and then convert them to integer data types in Pandas?
Answer: You can use the .round() method with zero decimal places and then use the .astype() method to convert the result to an integer data type. Rounding to more decimal places first does not help, because .astype(int) truncates the remaining fraction.