Aggregations
Aggregations create a new value by summarizing a Column. For
example, Mean, when applied to a column containing Number
data, returns a single decimal.Decimal value which is the average of
all values in that column.
Aggregations can be applied to single columns using the Table.aggregate()
method. The result is a single value if one aggregation was applied, or
a tuple of values if a sequence of aggregations was applied.
Aggregations can be applied to instances of TableSet using the
TableSet.aggregate() method. The result is a new Table
with a column for each aggregation and a row for each table in the set.
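As a rough illustration of the result shape described above, the following plain-Python sketch models a table set as a dict of value lists and builds one row per table and one column per aggregation. It does not call agate; `summarize`, `table_set`, and `aggregations` are hypothetical names chosen for this example.

```python
from statistics import mean


def summarize(table_set, aggregations):
    """Return (header, rows): one row per table, one column per aggregation."""
    header = ["group"] + [name for name, _ in aggregations]
    rows = [
        [key] + [func(values) for _, func in aggregations]
        for key, values in table_set.items()
    ]
    return header, rows


# Hypothetical data: two "tables", each reduced to a single column of numbers.
table_set = {"north": [3, 5, 10], "south": [4, 8]}
aggregations = [("count", len), ("mean", mean), ("max", max)]

header, rows = summarize(table_set, aggregations)
for row in rows:
    print(dict(zip(header, row)))
```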
| Aggregation | Aggregations create a new value by summarizing a Column. |
| Summary | Apply an arbitrary function to a column. |
Basic aggregations
| All | Check if all values in a column pass a test. |
| Any | Check if any value in a column passes a test. |
| Count | Count occurrences of a value or values. |
| HasNulls | Check if the column contains null values. |
| Min | Find the minimum value in a column. |
| Max | Find the maximum value in a column. |
| MaxPrecision | Find the most decimal places present for any value in this column. |
Statistical aggregations
| Deciles | Calculate the deciles of a column based on its percentiles. |
| IQR | Calculate the interquartile range of a column. |
| MAD | Calculate the median absolute deviation of a column. |
| Mean | Calculate the mean of a column. |
| Median | Calculate the median of a column. |
| Mode | Calculate the mode of a column. |
| Percentiles | Divide a column into 100 equal-size groups using the "CDF" method. |
| PopulationStDev | Calculate the population standard deviation of a column. |
| PopulationVariance | Calculate the population variance of a column. |
| Quartiles | Calculate the quartiles of a column based on its percentiles. |
| Quintiles | Calculate the quintiles of a column based on its percentiles. |
| StDev | Calculate the sample standard deviation of a column. |
| Sum | Calculate the sum of a column. |
| Variance | Calculate the sample variance of a column. |
Text aggregations
| MaxLength | Find the length of the longest string in a column. |
Detailed list
- class agate.Aggregation
- Bases: object
- Aggregations create a new value by summarizing a Column.
- Aggregations are applied with Table.aggregate() and TableSet.aggregate().
- When creating a custom aggregation, ensure that the values returned by Aggregation.run() are of the type specified by Aggregation.get_aggregate_data_type(). This can be ensured by using the DataType.cast() method. See Summary for an example.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
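The validate-then-run order described above can be sketched with stand-in classes. This is not agate code: `SketchAggregation`, `SketchMean`, and `aggregate` are hypothetical, a table is modeled as a dict of column lists, and skipping nulls in `run` is an assumption made for the example.

```python
from decimal import Decimal


class SketchAggregation:
    """Hypothetical stand-in for the validate/run protocol."""

    def validate(self, table):
        pass  # by default there is nothing to check

    def run(self, table):
        raise NotImplementedError


class SketchMean(SketchAggregation):
    def __init__(self, column_name):
        self._column_name = column_name

    def validate(self, table):
        # Fail early, before any computation, if the column is unusable.
        if self._column_name not in table:
            raise KeyError(self._column_name)
        for value in table[self._column_name]:
            if value is not None and not isinstance(value, Decimal):
                raise TypeError("SketchMean requires Decimal values")

    def run(self, table):
        # Assumption: nulls are skipped; an all-null column is not handled here.
        values = [v for v in table[self._column_name] if v is not None]
        return sum(values) / len(values)


def aggregate(table, aggregation):
    """Mimic the documented order: validate() is called before run()."""
    aggregation.validate(table)
    return aggregation.run(table)


table = {"price": [Decimal("1.50"), Decimal("2.50"), None, Decimal("5.00")]}
result = aggregate(table, SketchMean("price"))
print(result)
```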
 
- class agate.All(column_name, test)
- Bases: Aggregation
- Check if all values in a column pass a test.
- Parameters:
- column_name – The name of the column to check.
- test – Either a single value that all values in the column are compared against (for equality) or a function that takes a column value and returns True or False.
 
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 
- class agate.Any(column_name, test)
- Bases: Aggregation
- Check if any value in a column passes a test.
- Parameters:
- column_name – The name of the column to check.
- test – Either a single value that all values in the column are compared against (for equality) or a function that takes a column value and returns True or False.
 
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
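The test parameter of All and Any accepts either a plain value or a function. A minimal stdlib sketch of that dual behaviour follows; it does not use agate, and `as_predicate`, `all_pass`, and `any_pass` are hypothetical helper names.

```python
def as_predicate(test):
    """Turn a plain value into an equality check; pass callables through."""
    return test if callable(test) else (lambda value: value == test)


def all_pass(values, test):
    predicate = as_predicate(test)
    return all(predicate(v) for v in values)


def any_pass(values, test):
    predicate = as_predicate(test)
    return any(predicate(v) for v in values)


ages = [21, 35, 42]  # hypothetical column values

print(all_pass(ages, lambda v: v >= 18))  # function test
print(any_pass(ages, 35))                 # value test, compared for equality
```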
 
- class agate.Count(column_name=None, value=<object object>)
- Bases: Aggregation
- Count occurrences of a value or values.
- This aggregation can be used in three ways:
- If no arguments are specified, then it will count the number of rows in the table.
- If only column_name is specified, then it will count the number of non-null values in that column.
- If both column_name and value are specified, then it will count occurrences of a specific value.
 - Parameters:
- column_name – The column containing the values to be counted.
- value – Any value to be counted, including None.
 
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - run(table)
- Execute this aggregation on a given column and return the result.
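The three modes listed for Count, and the reason its value default is a bare object rather than None, can be sketched in plain Python. This is an illustrative model only: `count` and `_MISSING` are hypothetical names, and rows are modeled as dicts.

```python
_MISSING = object()  # sentinel, so that None itself remains a countable value


def count(rows, column_name=None, value=_MISSING):
    if column_name is None:
        # Mode 1: no arguments, count every row.
        return len(rows)
    if value is _MISSING:
        # Mode 2: column only, count non-null values.
        return sum(1 for row in rows if row[column_name] is not None)
    # Mode 3: column and value, count matches (None is allowed as the value).
    return sum(1 for row in rows if row[column_name] == value)


rows = [
    {"city": "Austin"},
    {"city": None},
    {"city": "Austin"},
    {"city": "Boise"},
]

print(count(rows))
print(count(rows, "city"))
print(count(rows, "city", "Austin"))
print(count(rows, "city", None))
```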
 
- class agate.Deciles(column_name)
- Bases: Aggregation
- Calculate the deciles of a column based on its percentiles.
- Deciles will be equivalent to the 10th, 20th … 90th percentiles.
- “Zeroth” (min value) and “Tenth” (max value) deciles are included for reference and intuitive indexing.
- See Percentiles for implementation details.
- This aggregation cannot be applied to a TableSet.
- Parameters:
- column_name – The name of a column containing Number data.
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 
- class agate.HasNulls(column_name)
- Bases: Aggregation
- Check if the column contains null values.
- Parameters:
- column_name – The name of the column to check.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - run(table)
- Execute this aggregation on a given column and return the result.
 
- class agate.IQR(column_name)
- Bases: Aggregation
- Calculate the interquartile range of a column.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
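For readers unfamiliar with the term, the interquartile range is the third quartile minus the first. The sketch below uses the standard library's `statistics.quantiles` (Python 3.8+) to show the idea; its default interpolation method is an assumption of this example and may not match the percentile method agate uses, so results can differ slightly. `interquartile_range` is a hypothetical helper.

```python
from statistics import quantiles


def interquartile_range(values):
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1


iqr = interquartile_range([1, 2, 3, 4, 5, 6, 7, 8])
print(iqr)
```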
 
- class agate.MAD(column_name)
- Bases: Aggregation
- Calculate the median absolute deviation of a column.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
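The median absolute deviation is the median of each value's absolute distance from the column median. A stdlib sketch of that textbook definition follows, assuming no scaling constant is applied; `median_absolute_deviation` is a hypothetical helper, not an agate function.

```python
from statistics import median


def median_absolute_deviation(values):
    center = median(values)
    # Distance of every value from the median, then the median of those.
    deviations = [abs(v - center) for v in values]
    return median(deviations)


mad = median_absolute_deviation([1, 1, 2, 2, 4, 6, 9])
print(mad)
```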
 
- class agate.Min(column_name)
- Bases: Aggregation
- Find the minimum value in a column.
- This aggregation can be applied to columns containing Date, DateTime, or Number data.
- Parameters:
- column_name – The name of the column to be searched.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
 
- class agate.Max(column_name)
- Bases: Aggregation
- Find the maximum value in a column.
- This aggregation can be applied to columns containing Date, DateTime, or Number data.
- Parameters:
- column_name – The name of the column to be searched.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
 
- class agate.MaxLength(column_name)
- Bases: Aggregation
- Find the length of the longest string in a column.
- Note: On Python 2.7 this function may miscalculate the length of unicode strings that contain “wide characters”. For details see this StackOverflow answer: https://stackoverflow.com/a/35462951
- Parameters:
- column_name – The name of a column containing Text data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 
- class agate.MaxPrecision(column_name)
- Bases: Aggregation
- Find the most decimal places present for any value in this column.
- Parameters:
- column_name – The name of the column to be searched.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
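One way to picture "most decimal places" is through the exponent of each decimal.Decimal value. The sketch below is an illustration under that assumption, not necessarily how agate computes it; `max_precision` is a hypothetical helper and it assumes finite values only.

```python
from decimal import Decimal


def max_precision(values):
    # A Decimal such as 2.125 has exponent -3, i.e. three decimal places.
    # Non-negative exponents (e.g. 3 or 1E+2) count as zero places.
    # Assumption: no NaN or Infinity values, whose exponent is not an int.
    places = [max(0, -v.as_tuple().exponent) for v in values if v is not None]
    return max(places, default=0)


values = [Decimal("1.5"), Decimal("2.125"), Decimal("3"), None]
print(max_precision(values))
```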
 
- class agate.Mean(column_name)
- Bases: Aggregation
- Calculate the mean of a column.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
 
- class agate.Median(column_name)
- Bases: Aggregation
- Calculate the median of a column.
- Median is equivalent to the 50th percentile. See Percentiles for implementation details.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
 
- class agate.Mode(column_name)
- Bases: Aggregation
- Calculate the mode of a column.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
 
- class agate.Percentiles(column_name)
- Bases: Aggregation
- Divide a column into 100 equal-size groups using the “CDF” method.
- See this explanation of the various methods for computing percentiles.
- “Zeroth” (min value) and “Hundredth” (max value) percentiles are included for reference and intuitive indexing.
- A reference implementation was provided by pycalcstats.
- This aggregation cannot be applied to a TableSet.
- Parameters:
- column_name – The name of a column containing Number data.
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
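Because the zeroth and hundredth percentiles are included, a percentile result has 101 entries indexed 0 to 100, and the Deciles, Quartiles, and Quintiles descriptions amount to taking evenly spaced entries from it. The sketch below shows only that indexing relationship with a placeholder list; it does not compute real percentiles and does not use agate.

```python
# Placeholder: the entry at index i stands in for the i-th percentile value.
percentiles = list(range(101))

quartiles = percentiles[::25]  # 0th, 25th, 50th, 75th, 100th
quintiles = percentiles[::20]  # 0th, 20th, ..., 100th
deciles = percentiles[::10]    # 0th, 10th, ..., 100th
median = percentiles[50]       # the median is the 50th percentile

# "Intuitive indexing": the k-th decile sits at index k.
print(deciles[1], quartiles[3], quintiles[5], median)
```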
 
- class agate.PopulationStDev(column_name)
- Bases: StDev
- Calculate the population standard deviation of a column.
- For the sample standard deviation see StDev.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
 
- class agate.PopulationVariance(column_name)
- Bases: Variance
- Calculate the population variance of a column.
- For the sample variance see Variance.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
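The population and sample variants (PopulationVariance and Variance, PopulationStDev and StDev) differ in the divisor: n for a population, n - 1 for a sample. The standard library's statistics module makes the same distinction, so it serves here as a stand-in to show the difference; this sketch does not call agate and uses plain integers rather than Decimal values.

```python
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, squared deviations sum to 32

population_variance = pvariance(data)  # 32 / 8
sample_variance = variance(data)       # 32 / 7
population_stdev = pstdev(data)        # square root of the population variance
sample_stdev = stdev(data)             # square root of the sample variance

print(population_variance, sample_variance)
print(population_stdev, sample_stdev)
```

The sample figures are always at least as large as the population figures for the same data, because the divisor is smaller.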
 
- class agate.Quartiles(column_name)
- Bases: Aggregation
- Calculate the quartiles of a column based on its percentiles.
- Quartiles will be equivalent to the 25th, 50th and 75th percentiles.
- “Zeroth” (min value) and “Fourth” (max value) quartiles are included for reference and intuitive indexing.
- See Percentiles for implementation details.
- This aggregation cannot be applied to a TableSet.
- Parameters:
- column_name – The name of a column containing Number data.
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 
- class agate.Quintiles(column_name)
- Bases: Aggregation
- Calculate the quintiles of a column based on its percentiles.
- Quintiles will be equivalent to the 20th, 40th, 60th and 80th percentiles.
- “Zeroth” (min value) and “Fifth” (max value) quintiles are included for reference and intuitive indexing.
- See Percentiles for implementation details.
- This aggregation cannot be applied to a TableSet.
- Parameters:
- column_name – The name of a column containing Number data.
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 
- class agate.StDev(column_name)
- Bases: Aggregation
- Calculate the sample standard deviation of a column.
- For the population standard deviation see PopulationStDev.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
 
- class agate.Sum(column_name)
- Bases: Aggregation
- Calculate the sum of a column.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.
 
- class agate.Summary(column_name, data_type, func, cast=True)
- Bases: Aggregation
- Apply an arbitrary function to a column.
- Parameters:
- column_name – The name of a column to be summarized.
- data_type – The return type of this aggregation.
- func – A function which will be passed the column for processing.
- cast – If True, each return value will be cast to the specified data_type to ensure it is valid. Only disable this if you are certain your summary always returns the correct type.
 
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - run(table)
- Execute this aggregation on a given column and return the result.
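The role of the func and cast parameters can be pictured with a small stand-in: the function receives the whole column and its return value is optionally converted to the declared type. This sketch is not agate's implementation; `run_summary` is hypothetical and a plain callable (`Decimal`) stands in for an agate data type's cast.

```python
from decimal import Decimal


def run_summary(column, func, cast=None):
    """Apply func to the whole column, then optionally cast the result."""
    result = func(column)
    return cast(result) if cast is not None else result


prices = [Decimal("3.10"), Decimal("12.00"), Decimal("25.50")]


def count_over_ten(column):
    # Returns a plain int; casting turns it into the declared type.
    return sum(1 for value in column if value > 10)


uncast = run_summary(prices, count_over_ten)
expensive = run_summary(prices, count_over_ten, cast=Decimal)
print(type(uncast).__name__, type(expensive).__name__)
```

Without the cast step the caller gets whatever type the function happens to return, which is why the parameter description advises disabling it only when the function is known to return the correct type.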
 
- class agate.Variance(column_name)
- Bases: Aggregation
- Calculate the sample variance of a column.
- For the population variance see PopulationVariance.
- Parameters:
- column_name – The name of a column containing Number data.
 - get_aggregate_data_type(table)
- Get the data type that should be used when using this aggregation with a TableSet to produce a new column. Should raise UnsupportedAggregationError if this column does not support aggregation into a TableSet. (For example, if it does not return a single value.)
 - validate(table)
- Perform any checks necessary to verify this aggregation can run on the provided table without errors. This is called by Table.aggregate() before run().
 - run(table)
- Execute this aggregation on a given column and return the result.