How to export a DataFrame to Excel
In this post we will see how we can export a Pandas DataFrame into Microsoft Excel. This is useful for when we have manipulated our data and we want to generate a final report to share with others or do other calculations within the Excel program.
The “trick” here is to use the XlsxWriter engine instead of the default way of exporting. The reason for this is because, in this way, we can create an Excel file with more than one sheets. Again, this is very useful when we want to organize our data into a more sophisticated way. For example we can have one sheet with all the “raw data” and another one with an outline or condensed report.
Let’s suppose we have a big dataset consisting of product data. It can be for example our inventory of products in various product categories. We would like to create an excel file that has three sheets. The first sheet has all the Raw Data in case we want to manipulate them from scratch in Excel. The other two sheets can have our data split into product categories. Let’s assume we only have product category “A” & product category “B”. We can use “Product Category A” and “Product Category “B” in the those Excel sheet names respectively.
# We import the Pandas Library import Pandas as pd # We will use a variable to set the name of the final excel file. generatedfile = 'Products.xlsx'
Here is how we use the XLsxWriter engine:
# Create a Pandas Excel writer using XlsxWriter as the engine. writer = pd.ExcelWriter(generatedfile, engine='xlsxwriter')
We then perform our data manipulation with Python and Pandas library. Let’s assume in this case we have created three different DataFrames. the DataFrame: “dfrawdata” contains all the Raw Data we have created or imported in our system. Then, after manipulating the Raw Data we could supposedly created two different new DataFrames. One contains the data of Product Category A and the other of the Product Category B (see this example of how you can filter your data in this way here). Let’s call those DataFrames: “dfcategorya” & “dfcategoryb” respectively.
And here is the last part of our code on how to export our DataFrame into Excel:
dfrawdata.to_excel(writer, sheet_name='Raw Data',index=False) dfcategorya.to_excel(writer, sheet_name='Product Category A',index=False) dfcategoryb.to_excel(writer, sheet_name='Product Category B',index=False)
DataFrames automatically create a numerical index when we import or create a new Dataset. Since that index is not part of our original “business” data we can remove it by passing the index=False argument in our example above.