You may need to install the following libraries (if you don't already have them)
install.packages("RODBC")
install.packages("dplyr")
Load these packages
library(RODBC)
library(dplyr)
There are two methods for connecting with a MS SQL server. For Windows the way most commonly documented is to create an ODBC connection, and then connect to it from R. This is documented here.
Once created, you can use the following command to set up the connection, where dsn
is the name of the connection that you specified.
Note that this method will necessitate you setting a default database.
conn <- odbcConnect(
dsn = "My_ODBC_connection"
)
A slightly more generic method which I favour is to use the odbcDriverConnect()
function instead.
Using this method does not require a connect to be set up in Windows first, and may provide a better cross-platform solution.
It also allows you to quickly change the default database, which may be beneficial (although from memory, using fully qualified table names [database].[schema].[table] means that setting a default database is not too important).
conn <- odbcDriverConnect(
'driver={SQL Server},server=3DCPRI-PDB16/ACSQLS,database=KS4,trusted_connection=true'
)
Once we have a conn
object, we can start interacting with the database.
To load a list of tables in the default database:
sqlTables(
conn,
tableType = "TABLE"
)
To get a list of columns in a given table:
sqlColumns(
conn,
"example_table"
)
To send a short query to the database:
sqlQuery(
conn,
"select count(*) from example_table;"
)
Writing longer queries in this command is a bit of a pain, but there is a nice solution to this problem given here.
First define a function to read in .sql
files.
read_sql <- function(path){
# Check that the file actually exists
stopifnot(file.exists(path))
# Read all the characters from the given file
sql <- readChar(path, nchar = file.info(path)$size)
# Print the sql query
sql
}
Now we can use the pipe %>%
from dplyr
to pipe our sql queries directly to the sqlQuery()
command, making it much easier to run very long SQL queries, and still allowing us to run them in SQL without a load of copying and pasting.
"sql_query.sql" %>%
read_sql %>%
sqlQuery(conn, .)
More conventionally this would be:
sqlQuery(conn, read_sql("sql_query.sql"))
Note that you must not include USE database; GO
commands in the query, if you are setting a database in this way, it will not work. Instead either set the default database in the dbcDriverConnect()
command, or specify your tables with a fully qualified path, like: [database].[schema].[table].