import arcpy
import pandas as pd


def arcgis_table_to_df(in_fc, input_fields=None, query=""):
    """Function will convert an arcgis table into a pandas dataframe with an object ID index, and the selected
    input fields using an arcpy.da.SearchCursor.
    :param - in_fc - input feature class or table to convert
    :param - input_fields - fields to input to a da search cursor for retrieval
    :param - query - sql query to grab appropriate values
    :returns - pandas.DataFrame"""
    OIDFieldName = arcpy.Describe(in_fc).OIDFieldName
    if input_fields:
        final_fields = [OIDFieldName] + input_fields
    else:
        final_fields = [field.name for field in arcpy.ListFields(in_fc)]
    data = [row for row in arcpy.da.SearchCursor(in_fc, final_fields, where_clause=query)]
    fc_dataframe = pd.DataFrame(data, columns=final_fields)
    fc_dataframe = fc_dataframe.set_index(OIDFieldName, drop=True)
    return fc_dataframe


def arcgis_table_to_dataframe(in_fc, input_fields, query="", skip_nulls=False, null_values=None):
    """Function will convert an arcgis table into a pandas dataframe with an object ID index, and the selected
    input fields. Uses TableToNumPyArray to get initial data.
    :param - in_fc - input feature class or table to convert
    :param - input_fields - fields to input into a da numpy converter function
    :param - query - sql like query to filter out records returned
    :param - skip_nulls - skip rows with null values
    :param - null_values - values to replace null values with.
    :returns - pandas dataframe"""
    OIDFieldName = arcpy.Describe(in_fc).OIDFieldName
    if input_fields:
        final_fields = [OIDFieldName] + input_fields
    else:
        final_fields = [field.name for field in arcpy.ListFields(in_fc)]
    np_array = arcpy.da.TableToNumPyArray(in_fc, final_fields, query, skip_nulls, null_values)
    object_id_index = np_array[OIDFieldName]
    fc_dataframe = pd.DataFrame(np_array, index=object_id_index, columns=input_fields)
    return fc_dataframe
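A minimal usage sketch (the geodatabase path, field names, and query below are hypothetical; adjust them to your own data):

# Hypothetical feature class and fields -- not part of the original gist.
fc = r"C:\data\parcels.gdb\parcels"
df = arcgis_table_to_df(fc, input_fields=["LandUse", "Acres"], query="Acres > 1")
print(df.head())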
Over here
https://my.usgs.gov/confluence/x/YSxKI
I found a useful snippet to go back to an arcgis table; swap arcpy.da.NumPyArrayToTable for arcpy.da.NumPyArrayToFeatureClass if you have XY attributes.
'Shape' can be resource intensive with polygons. Since my workflow involved going back to arcgis, I also had to drop 'Shape_Area' and 'Shape_Length', so I added the line below.
final_fields = [fld for fld in final_fields if fld not in ('Shape', 'Shape_Area', 'Shape_Length')]
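For the return trip, a rough sketch of writing a DataFrame back out with NumPyArrayToTable (the output path is hypothetical, and object-dtype string columns typically need to be cast to fixed-width text before conversion):

import numpy as np

# Assumes df comes from arcgis_table_to_df and contains only numeric or
# fixed-width text columns; object-dtype columns will fail in NumPyArrayToTable.
records = df.reset_index().to_records(index=False)
arcpy.da.NumPyArrayToTable(np.asarray(records), r"C:\data\scratch.gdb\df_export")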
Nice! Something I found out recently that might make it a little cleaner is the required property of fields. It could help with filtering out derived/geometry fields, something like the sketch below.
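A minimal sketch of that idea, using the same in_fc variable as the functions above (Field.required is a standard arcpy Field property):

# Keep the OID plus user-maintained fields; system-maintained fields
# (OBJECTID, Shape, Shape_Length, Shape_Area, ...) report required == True.
OIDFieldName = arcpy.Describe(in_fc).OIDFieldName
final_fields = [OIDFieldName] + [field.name for field in arcpy.ListFields(in_fc)
                                 if not field.required]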
This is so useful - exactly what I was looking for! Would love to see this added to arcpy or the ArcGIS Python API. I haven't found anything comparable in either.
Thanks! I keep using it, as it still seems faster and more dependable than the API. The trick I am running into now is join functions, and I have had some success there. Let me know if you have anything you can share back! I want to move beyond record joins.
Thank you!! So, how can I change this to also return the .aliasName and field description values?
Thanks, I am using the edited version from @vitale232 and that works like a charm!
The most updated version should match the functionality provided by @vitale232. Is there an error in the current gist?
Thanks, this is wonderful :) It solved some problems for me.
Not enough information is provided to assist. Can you provide a full snippet?
Sorry, I was not able to visualize the data using the code below:
def table_to_data_frame(in_table, input_fields=None, where_clause="insert_date >= TO_DATE('01/01/2021','DD/MM/YYYY') and insert_date < TO_DATE('02/01/2021','DD/MM/YYYY')"):
    """Function will convert an arcgis table into a pandas dataframe with an object ID index, and the selected
    input fields using an arcpy.da.SearchCursor."""
    OIDFieldName = arcpy.Describe(in_table).OIDFieldName
    if input_fields:
        final_fields = [OIDFieldName] + input_fields
    else:
        final_fields = [field.name for field in arcpy.ListFields(in_table)]
    data = [row for row in arcpy.da.SearchCursor(in_table, final_fields, where_clause=where_clause)]
    fc_dataframe = pd.DataFrame(data, columns=final_fields)
    fc_dataframe = fc_dataframe.set_index(OIDFieldName, drop=True)
    return fc_dataframe
Then I realized that I needed to call it with "fc_dataframe = table_to_data_frame(in_table)" at the end, and also that my where_clause is invalid. If I remove the SQL query it works, and other, less complex queries also work. My issue now is how to query by date: I want to filter by date before converting the table.
This modification is beyond the scope of the gist. My only advice is check that you are using the appropriate flavor of SQL with regard to temporal queries for your RDBMS. Thanks.
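For example, if the data were in a file geodatabase rather than Oracle (an assumption, since the post does not say where the table lives), the TO_DATE syntax would not be valid and the filter would look more like this:

# File geodatabase date-literal syntax; the insert_date field name is kept from the post.
where_clause = "insert_date >= date '2021-01-01' AND insert_date < date '2021-02-01'"
fc_dataframe = table_to_data_frame(in_table, where_clause=where_clause)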
Thank you very much! An excellent tool! Very useful!
I got the following exception: "Data must be 1-dimensional". It turns out that the Shape field may cause the problem, so I added a drop_shape argument to the functions.
# -*- encoding: utf-8 -*-
# Functionality: Convert feature class to pandas dataframe
# Original Author: [d-wasserman](https://gist.github.com/d-wasserman)
# Modified by [Choumingzhao](https://gist.github.com/Choumingzhao)
# Source: https://gist.github.com/d-wasserman/e9c98be1d0caebc2935afecf0ba239a0
import arcpy
import pandas as pd


def fc_to_df(in_fc, input_fields=None, drop_shape=True, query=""):
    """Function will convert an arcgis feature class table into a pandas dataframe with an object ID index, and the selected
    input fields using an arcpy.da.SearchCursor.
    :param - in_fc - input feature class or table to convert
    :param - input_fields - fields to input to a da search cursor for retrieval
    :param - drop_shape - drop the shape field from the dataframe
    :param - query - sql query to grab appropriate values
    :returns - pandas.DataFrame"""
    OIDFieldName = arcpy.Describe(in_fc).OIDFieldName
    if input_fields:
        final_fields = [OIDFieldName] + input_fields
    else:
        final_fields = [field.name for field in arcpy.ListFields(in_fc)]
    if drop_shape and u"Shape" in final_fields:
        final_fields.remove(u"Shape")
    data = [row for row in arcpy.da.SearchCursor(in_fc, final_fields, where_clause=query)]
    fc_dataframe = pd.DataFrame(data, columns=final_fields)
    fc_dataframe = fc_dataframe.set_index(OIDFieldName, drop=True)
    return fc_dataframe


def fc_to_df2(in_fc, input_fields=None, drop_shape=True, query="", skip_nulls=False, null_values=None):
    """Function will convert an arcgis feature class table into a pandas dataframe with an object ID index, and the selected
    input fields. Uses TableToNumPyArray to get initial data.
    :param - in_fc - input feature class or table to convert
    :param - input_fields - fields to input into a da numpy converter function
    :param - drop_shape - drop the shape field from the dataframe
    :param - query - sql like query to filter out records returned
    :param - skip_nulls - skip rows with null values
    :param - null_values - values to replace null values with.
    :returns - pandas dataframe"""
    OIDFieldName = arcpy.Describe(in_fc).OIDFieldName
    if input_fields:
        final_fields = [OIDFieldName] + input_fields
    else:
        final_fields = [field.name for field in arcpy.ListFields(in_fc)]
    if drop_shape and u"Shape" in final_fields:
        final_fields.remove(u"Shape")
    np_array = arcpy.da.TableToNumPyArray(in_fc, final_fields, query, skip_nulls, null_values)
    object_id_index = np_array[OIDFieldName]
    fc_dataframe = pd.DataFrame(np_array, index=object_id_index, columns=input_fields)
    return fc_dataframe
I might suggest using a Describe object to make sure you actually get the geometry field, since it is not always named "Shape", as sketched below. I can take a look at the drop_shape component.
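A sketch only, reusing the variable names from the snippet above:

# Use the geometry field name reported by Describe instead of assuming "Shape".
desc = arcpy.Describe(in_fc)
shape_field = getattr(desc, "shapeFieldName", None)  # plain tables have no shape field
if drop_shape and shape_field and shape_field in final_fields:
    final_fields.remove(shape_field)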
Thanks for your useful function, but sometimes I have problems. I resolved them with this:
def table_to_data_frame(in_table, input_fields=None, where_clause=None):
    OIDFieldName = arcpy.Describe(in_table).OIDFieldName
    if input_fields:
        final_fields = [OIDFieldName] + input_fields
    else:
        final_fields = [field.name for field in arcpy.ListFields(in_table)]
    data = [row for row in arcpy.da.SearchCursor(in_table, final_fields, where_clause=where_clause)]
    df = pd.DataFrame(data, columns=final_fields)
    return df
It seems faster, thanks again!
I think I need to see the full function definition to understand what you are suggesting @salvuccigianluigi. In my experience, arcgis_table_to_df is faster.
Does anyone have an easy way to go backwards? After converting the table to a DataFrame, I want to do some modifications to the DataFrame and then update the original feature's attribute table to reflect it.
Thanks!!
There are a few approaches to this.
- Use the ArcGIS Python API and work with a spatially enabled dataframe and export back instead of these functions.
- Use ExtendTable (https://github.com/d-wasserman/arc-sampling-and-scoring/blob/f5f442675447eda2c7da7822789bde054d65952b/Scripts/PercentileScoreFields.py#L89). It is very finicky and reliably works only with numeric columns. It can work with text with some setup, I have been told, but in practice I stay away from it.
- Use insert/update cursors. This takes some setup, but you can iterate over the dataframe rows and approach it this way (see the sketch after this list).
- Export to CSV, arcpy.TableToTable, join fields. This is pretty slow, but can be faster if you add field indexes. I don't prefer this approach.
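For option 3, a minimal sketch of pushing edited values back with an update cursor. It assumes df is indexed by ObjectID (as returned by arcgis_table_to_df); in_fc and update_fields are placeholders for your own feature class and the columns you changed:

update_fields = ["MyField1", "MyField2"]  # hypothetical columns edited in df
oid_field = arcpy.Describe(in_fc).OIDFieldName

with arcpy.da.UpdateCursor(in_fc, [oid_field] + update_fields) as cursor:
    for row in cursor:
        oid = row[0]
        if oid in df.index:
            # Overwrite the attribute values with the edited DataFrame values.
            row[1:] = [df.at[oid, fld] for fld in update_fields]
            cursor.updateRow(row)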
You could retain the geometry by reading the entire feature class with geopandas, then write it to a shapefile or a geopackage and read it back into ArcGIS. This can also be used to perform functions in geopandas that you may not have a license for in ArcGIS.
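A rough sketch of that round trip (paths and the layer name are hypothetical; it requires geopandas with a driver that can read file geodatabases, e.g. OpenFileGDB):

import geopandas as gpd

# Read a layer straight from the file geodatabase, geometry included.
gdf = gpd.read_file(r"C:\data\census.gdb", layer="cities")
# ... do the pandas/geopandas work here ...
# Write to a GeoPackage that ArcGIS can add back as a layer.
gdf.to_file(r"C:\data\cities_edited.gpkg", layer="cities", driver="GPKG")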
Hi GIS Luke,
Is there a limitation to your first option? I usually only use arcpy and not arcgis, as I always thought the ArcGIS API was only for web applications.
ESRI writes:
ArcPy and the ArcGIS API for Python are complementary libraries; ArcPy allows you to use, automate and extend desktop GIS, and the ArcGIS API for Python supports the same for web GIS.
If I can work with arcpy and SEDFs from arcgis on local data without any Enterprise/ArcGIS Online dependencies, it is more than complementary to me. On the contrary, it is a good addition to arcpy!
They don't emphasize the fact that it can create an SEDF locally and write it back, but that functionality alone makes it a good complement these days.
To read from a local FC:
from arcgis.features import GeoAccessor, GeoSeriesAccessor  # registers the .spatial accessor on pandas
sdf = pd.DataFrame.spatial.from_featureclass(r"path\to\your\data\census_example\census.gdb\cities")
sdf.head()
And to write it back to a local FC:
sdf.spatial.to_featureclass(location=r"c:\output_examples\census.gdb\cities")
The problem with this is that it sometimes changes the type of numeric fields (int to float).
Has anyone encountered "ValueError: Object: Error in accessing describe" while calling arcpy.Describe()? I didn't encounter any error messages previously, but suddenly this error occurs.
Looking forward to seeing what you have!