What is the problem this feature will solve?
Currently, Astropy Quantity objects are converted to float64 when they are placed into a pandas DataFrame, for example through the QTable.to_pandas() method. They thereby lose their unit attribute, and with it much of their usefulness. One workaround is to use an object dtype for the pd.Series, but that is quite inefficient, especially when multiple operations are performed on the objects later on, because it requires constantly converting back and forth between pd.Series and Quantity.
In 2019/20 @janpipek developed the pandas-units-extension package to enable native support of Quantity objects in pandas using the pandas ExtensionDtype/ExtensionArray interfaces. Some months ago I forked the project (see pandas-units-extension (fork)) to make it compatible with modern pandas. Over the past months I have updated the implementation and the tests and added some more features. I think it is now in a state where it is ready to be published again.
I already introduced the project in last month's "Astropy Dev Telecon". We came to the conclusion that in the longer term this could move from a separate package into astropy's core package, but that some things are still left to be discussed (for example the dtype string representation, see Issue #7). I will be available again during this week's meeting if someone is interested in having a chat. If my notes are correct, @taldcroft and @neutrinoceros were interested in the topic, but I should also tag the astropy.units core maintainers, who according to the team page should be @nstarman and @mhvk.
After a successful integration one could also think about supporting more complex objects like SkyCoord or masked Quantity. Based on the Astropy EA I have already started developing another EA for the Skyfield API (Pandas Skyfield Extension). Depending on the desired level of functionality, this can be done rather quickly.
Describe the desired outcome
Full support for Quantity objects in pandas Series and DataFrame, including:
- Conversion from/to Quantity/QTable
- Conversion to other units, including equivalencies maps
- Conversion from string, like from a csv file
- Arithmetic and comparison operations between Series/DataFrame and another pandas object or astropy Quantity
- Numeric reductions like sum, max, std, ...
- Concatenation of different units of same physical type (e.g. m and ft)
Some examples:
Create a pandas Series containing Quantity objects:
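>>> import astropy.units as u
>>> import pandas as pd
>>> # the extension package must also be imported so that the "unit" dtype is registered
>>> q: u.Quantity = [1, 2, 3] * u.m
>>> q
<Quantity [1., 2., 3.] m>
>>> length_sr: pd.Series = pd.Series(q, dtype="unit")
>>> length_sr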
0 1.0 m
1 2.0 m
2 3.0 m
dtype: unit[m]
Comparison operations:
>>> length_sr > 150 * u.cm
0 False
1 True
2 True
dtype: bool
Arithmetic operations:
>>> velocity_sr: pd.Series = length_sr / (1 * u.s)
>>> velocity_sr
0 1.0 m / s
1 2.0 m / s
2 3.0 m / s
dtype: unit[m / s]
Conversion to other units via custom SeriesAccessor:
>>> velocity_sr.units.to(u.km/u.h)
0 3.6 km / h
1 7.2 km / h
2 10.8 km / h
dtype: unit[km / h]
Convert back to Quantity:
>>> velocity_sr.units.to_quantity()
<Quantity [1., 2., 3.] m / s>
This is just an example for now. Ideally, one would also alter QTable.to_pandas() and QTable.from_pandas() to be unit-aware by utilizing the UnitsDtype.
Additional context
No response