-
-
Save d-wasserman/e9c98be1d0caebc2935afecf0ba239a0 to your computer and use it in GitHub Desktop.
import arcpy | |
import pandas as pd | |
def arcgis_table_to_df(in_fc, input_fields=None, query=""): | |
"""Function will convert an arcgis table into a pandas dataframe with an object ID index, and the selected | |
input fields using an arcpy.da.SearchCursor. | |
:param - in_fc - input feature class or table to convert | |
:param - input_fields - fields to input to a da search cursor for retrieval | |
:param - query - sql query to grab appropriate values | |
:returns - pandas.DataFrame""" | |
OIDFieldName = arcpy.Describe(in_fc).OIDFieldName | |
if input_fields: | |
final_fields = [OIDFieldName] + input_fields | |
else: | |
final_fields = [field.name for field in arcpy.ListFields(in_fc)] | |
data = [row for row in arcpy.da.SearchCursor(in_fc,final_fields,where_clause=query)] | |
fc_dataframe = pd.DataFrame(data,columns=final_fields) | |
fc_dataframe = fc_dataframe.set_index(OIDFieldName,drop=True) | |
return fc_dataframe | |
def arcgis_table_to_dataframe(in_fc, input_fields, query="", skip_nulls=False, null_values=None): | |
"""Function will convert an arcgis table into a pandas dataframe with an object ID index, and the selected | |
input fields. Uses TableToNumPyArray to get initial data. | |
:param - in_fc - input feature class or table to convert | |
:param - input_fields - fields to input into a da numpy converter function | |
:param - query - sql like query to filter out records returned | |
:param - skip_nulls - skip rows with null values | |
:param - null_values - values to replace null values with. | |
:returns - pandas dataframe""" | |
OIDFieldName = arcpy.Describe(in_fc).OIDFieldName | |
if input_fields: | |
final_fields = [OIDFieldName] + input_fields | |
else: | |
final_fields = [field.name for field in arcpy.ListFields(in_fc)] | |
np_array = arcpy.da.TableToNumPyArray(in_fc, final_fields, query, skip_nulls, null_values) | |
object_id_index = np_array[OIDFieldName] | |
fc_dataframe = pd.DataFrame(np_array, index=object_id_index, columns=input_fields) | |
return fc_dataframe |
I got following
Exception: Data must be 1-dimensional
. And it turns out that the@Shape
field may cause the problem. So I add adrop_shape
argument to the functions.# -*- encoding: utf-8 -*- # Functionality: Convert feature class to pandas dataframe # Original Author: [d-wasserman](https://gist.github.com/d-wasserman) # Modified by [Choumingzhao](https://gist.github.com/Choumingzhao) # Source: https://gist.github.com/d-wasserman/e9c98be1d0caebc2935afecf0ba239a0 import arcpy import pandas as pd def fc_to_df(in_fc, input_fields=None, drop_shape=True, query=""): """Function will convert an arcgis feature class table into a pandas dataframe with an object ID index, and the selected input fields using an arcpy.da.SearchCursor. :param - in_fc - input feature class or table to convert :param - input_fields - fields to input to a da search cursor for retrieval :param - drop_shape - drop the shape field from the dataframe :param - query - sql query to grab appropriate values :returns - pandas.DataFrame""" OIDFieldName = arcpy.Describe(in_fc).OIDFieldName if input_fields: final_fields = [OIDFieldName] + input_fields else: final_fields = [field.name for field in arcpy.ListFields(in_fc)] if drop_shape and u"Shape" in final_fields: final_fields.remove(u"Shape") data = [row for row in arcpy.da.SearchCursor(in_fc,final_fields,where_clause=query)] fc_dataframe = pd.DataFrame(data, columns=final_fields) fc_dataframe = fc_dataframe.set_index(OIDFieldName,drop=True) return fc_dataframe def fc_to_df2(in_fc, input_fields=None, drop_shape=True, query="", skip_nulls=False, null_values=None): """Function will convert an arcgis feature class table into a pandas dataframe with an object ID index, and the selected input fields. Uses TableToNumPyArray to get initial data. :param - in_fc - input feature class or table to convert :param - input_fields - fields to input into a da numpy converter function :param - drop_shape - drop the shape field from the dataframe :param - query - sql like query to filter out records returned :param - skip_nulls - skip rows with null values :param - null_values - values to replace null values with. :returns - pandas dataframe""" OIDFieldName = arcpy.Describe(in_fc).OIDFieldName if input_fields: final_fields = [OIDFieldName] + input_fields else: final_fields = [field.name for field in arcpy.ListFields(in_fc)] if drop_shape and u"Shape" in final_fields: final_fields.remove(u"Shape") np_array = arcpy.da.TableToNumPyArray(in_fc, final_fields, query, skip_nulls, null_values) object_id_index = np_array[OIDFieldName] fc_dataframe = pd.DataFrame(np_array, index=object_id_index, columns=input_fields) return fc_dataframe
I might suggest using a describe object to make sure you actually get the geometry field. I can take a look at the drop shape component.
thanks for your usefull function, but sometimes I've problems.
I resolve with this:
def table_to_data_frame(in_table, input_fields=None, where_clause=None):
OIDFieldName = arcpy.Describe(in_table).OIDFieldName
if input_fields:
final_fields = [OIDFieldName] + input_fields
else:
final_fields = [field.name for field in arcpy.ListFields(in_table)]
df = pd.DataFrame(columns=final_fields)
data = [row for row in arcpy.da.SearchCursor(feature, final_fields)]
df = pd.DataFrame(data, columns =final_fields)
return df
it seems more fast, thanks again!
I think I need to see the full function definition to understand what you are suggesting @salvuccigianluigi. In my experience, arcgis_table_to_df is faster.
Does anyone have an easy way to go backwards? After converting the table to a DataFrame, I want to do some modifications to the DataFrame and then update the original feature's attribute table to reflect it.
Thanks!!
There are a few approaches to this.
- Use the ArcGIS Python API and work with a spatially enabled dataframe and export back instead of these functions.
- Use ExtendTable. It is very finicky and works only with numeric columns reliably. It can work with text with some set up I have been told, but in practice I stay away.
- Use insert/update cursors. This takes some set up, but you can iterrows and approach it this way.
- Export to CSV, arcpy.TableToTable, join fields. This is pretty slow, but can be faster if you do field indexes. I don't prefer this approach.
Does anyone have an easy way to go backwards? After converting the table to a DataFrame, I want to do some modifications to the DataFrame and then update the original feature's attribute table to reflect it.
Thanks!!
You could retain the geometry by reading the entire feature class with geopandas, then write it to a shapefile or a geopackage and read it back into ArcGIS. This can also be used to perform functions in geopandas that you may not have a license for in ArcGIS.
There are a few approaches to this.
1. Use the ArcGIS Python API and work with a spatially enabled dataframe and export back instead of these functions. 2. Use [ExtendTable](https://github.com/d-wasserman/arc-sampling-and-scoring/blob/f5f442675447eda2c7da7822789bde054d65952b/Scripts/PercentileScoreFields.py#L89). It is very finicky and works only with numeric columns reliably. It can work with text with some set up I have been told, but in practice I stay away. 3. Use insert/update cursors. This takes some set up, but you can iterrows and approach it this way. 4. Export to CSV, arcpy.TableToTable, join fields. This is pretty slow, but can be faster if you do field indexes. I don't prefer this approach.
Hi GIS Luke,
Is there a limitation to your first option? I usually only use arcpy and not arcgis as I always thought the arcgis API was only for web applications.
ESRI writes:
ArcPy and the ArcGIS API for Python are complimentary libraries; ArcPy allows you to use, automate and extend desktop GIS, and the ArcGIS API for Python supports the same for web GIS.
If I can work with arcpy and sedf from arcgis on local data without any Enterprise/ArcGISOnline dependencies, it is not complementary to me. On the contrary, it is a good addition to arcpy!
They don't emphasize the fact it can create SEDF locally and then back, but just that functionality alone makes it a good complement these days.
look here
to read from local FC
sdf = pd.DataFrame.spatial.from_featureclass("path\to\your\data\census_example\census.gdb\cities")
sdf.head()
write it back to local fc
sdf.spatial.to_featureclass(location=r"c:\output_examples\census.gdb\cities");
the problen with this it someting chagnge the type on numerc fields (int to float)
Has anyone encountered a ValueError: Object: Error in accessing describe while trying to access 'arcgis.Describe()'? I didn't encounter any error messages previously, but suddenly this error occurs.
I got following
Exception: Data must be 1-dimensional
. And it turns out that the@Shape
field may cause the problem. So I add adrop_shape
argument to the functions.