Python - Using SQLAlchemy to load a CSV file into a database
I am trying to learn to program in Python. I have CSV files that I want to load into a database. The idea is to use the SQLAlchemy framework to insert the data.
Each file is a database table. Some of these files have foreign keys to other CSV files / DB tables.
Thanks!
Because of the power of SQLAlchemy, I'm also using it on a project. Its power comes from the object-oriented way of "talking" to a database instead of hardcoding SQL statements that can be a pain to manage. Not to mention, in the examples here it's also a lot faster.
To answer your question bluntly, yes! Storing data from a CSV into a database using SQLAlchemy is a piece of cake. Here's a full working example (I used SQLAlchemy 1.0.6 and Python 2.7.6):
    from numpy import genfromtxt
    from time import time
    from datetime import datetime
    from sqlalchemy import Column, Integer, Float, Date
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    def load_data(file_name):
        # Skip the header row; keep column 0 (the date) as a string
        data = genfromtxt(file_name, delimiter=',', skip_header=1, converters={0: lambda s: str(s)})
        return data.tolist()

    base = declarative_base()

    class price_history(base):
        # Tell SQLAlchemy the table name and any table-specific arguments it should know about
        __tablename__ = 'price_history'
        __table_args__ = {'sqlite_autoincrement': True}
        # Tell SQLAlchemy the name of each column and its attributes:
        id = Column(Integer, primary_key=True, nullable=False)
        date = Column(Date)
        opn = Column(Float)
        hi = Column(Float)
        lo = Column(Float)
        close = Column(Float)
        vol = Column(Float)

    if __name__ == "__main__":
        t = time()

        # Create the database
        engine = create_engine('sqlite:///csv_test.db')
        base.metadata.create_all(engine)

        # Create the session
        session = sessionmaker()
        session.configure(bind=engine)
        s = session()

        try:
            file_name = "t.csv"  # sample csv file used: http://www.google.com/finance/historical?q=nyse%3at&ei=w4ikvam8lywjmagjhohacw&output=csv
            data = load_data(file_name)

            for i in data:
                record = price_history(**{
                    'date': datetime.strptime(i[0], '%d-%b-%y').date(),
                    'opn': i[1],
                    'hi': i[2],
                    'lo': i[3],
                    'close': i[4],
                    'vol': i[5]
                })
                s.add(record)  # Add all the records

            s.commit()  # Attempt to commit all the records
        except:
            s.rollback()  # Roll back the changes on error
        finally:
            s.close()  # Close the connection
        print("time elapsed: " + str(time() - t) + " s.")  # 0.091s
(Note: this is not necessarily the "best" way to do this, but I think this format is very readable for a beginner; it's also very fast: 0.091s for 251 records inserted!)
I think if you go through it line by line, you'll see what a breeze it is to use. Notice the lack of SQL statements. Hooray! I also took the liberty of using numpy to load the CSV contents in two lines, but it can be done without it if you like.
If you wanted to compare against the traditional way of doing it, here's a full working example for reference:
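For the "without numpy" route, here is a minimal sketch using only the standard library's csv module. It is written for Python 3, and it assumes the same column layout as the example (Date, Open, High, Low, Close, Volume) with dates such as 2-Jul-15. The load_rows name and the two sample rows are made up for illustration.

```python
import csv
import io
from datetime import datetime


def load_rows(csv_file):
    """Yield one dict per CSV row, with values converted to Python types."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header line
    for row in reader:
        yield {
            "date": datetime.strptime(row[0], "%d-%b-%y").date(),
            "opn": float(row[1]),
            "hi": float(row[2]),
            "lo": float(row[3]),
            "close": float(row[4]),
            "vol": float(row[5]),
        }


# Made-up sample data in the assumed layout; a real run would pass an
# open file object instead, e.g. open("t.csv", newline="").
sample = io.StringIO(
    "Date,Open,High,Low,Close,Volume\n"
    "2-Jul-15,35.50,35.80,35.30,35.60,1000\n"
    "1-Jul-15,35.10,35.55,35.00,35.45,2000\n"
)
rows = list(load_rows(sample))
```

Each dict uses the same keys as the columns of the price_history class, so a row could be turned into a record with price_history(**row).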
    import sqlite3
    import time
    from numpy import genfromtxt

    def dict_factory(cursor, row):
        # Return each row as a dict keyed by column name
        d = {}
        for idx, col in enumerate(cursor.description):
            d[col[0]] = row[idx]
        return d

    def create_db(db):
        # Create the db and format it as needed
        with sqlite3.connect(db) as conn:
            conn.row_factory = dict_factory
            conn.text_factory = str
            cursor = conn.cursor()
            cursor.execute("CREATE TABLE [price_history] ([id] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE, [date] DATE, [opn] FLOAT, [hi] FLOAT, [lo] FLOAT, [close] FLOAT, [vol] INTEGER);")

    def add_record(db, data):
        # Insert a record into the table
        with sqlite3.connect(db) as conn:
            conn.row_factory = dict_factory
            conn.text_factory = str
            cursor = conn.cursor()
            # Column names come from the dict keys; values are bound through
            # "?" placeholders rather than pasted into the SQL string
            cols = ', '.join(data.keys())
            marks = ', '.join('?' for _ in data)
            cursor.execute("INSERT INTO price_history({cols}) VALUES({vals});".format(cols=cols, vals=marks),
                           list(data.values()))

    def load_data(file_name):
        # skip_header replaces the older skiprows argument of genfromtxt
        data = genfromtxt(file_name, delimiter=',', skip_header=1, converters={0: lambda s: str(s)})
        return data.tolist()

    if __name__ == "__main__":
        t = time.time()

        db = 'csv_test_sql.db'  # Database filename
        file_name = "t.csv"  # sample csv file used: http://www.google.com/finance/historical?q=nyse%3at&ei=w4ikvam8lywjmagjhohacw&output=csv

        data = load_data(file_name)  # Get data from the CSV

        create_db(db)  # Create the db

        # For every record, format it and insert it into the table
        for i in data:
            record = {
                'date': i[0],
                'opn': i[1],
                'hi': i[2],
                'lo': i[3],
                'close': i[4],
                'vol': i[5]
            }
            add_record(db, record)

        print("time elapsed: " + str(time.time() - t) + " s.")  # 3.604s
(Note: even in the "old" way, this is by no means the best way to do this, but it's very readable and a "1-to-1" translation from the SQLAlchemy way vs. the "old" way.)
Notice the SQL statements: one to create the table, the other to insert records. Also, notice that it's a bit more cumbersome to maintain long SQL strings vs. a simple class attribute addition. Liking SQLAlchemy so far?
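A caveat on the timings: much of the gap probably comes from the "old" example opening a new connection and committing once per record, rather than from raw SQL as such. As a rough sketch (Python 3, standard library only), the same insert can go through one connection and one transaction with executemany. The add_records helper, the in-memory database and the two sample records are assumptions for illustration, not part of the original benchmark.

```python
import sqlite3


def add_records(conn, records):
    """Insert all records through one connection and one transaction."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany(
            "INSERT INTO price_history (date, opn, hi, lo, close, vol) "
            "VALUES (:date, :opn, :hi, :lo, :close, :vol)",
            records,
        )


# In-memory database so the sketch leaves no file behind
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE price_history ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "date DATE, opn FLOAT, hi FLOAT, lo FLOAT, close FLOAT, vol INTEGER)"
)

# Made-up records with the same keys as in the example
records = [
    {"date": "2-Jul-15", "opn": 35.5, "hi": 35.8, "lo": 35.3, "close": 35.6, "vol": 1000},
    {"date": "1-Jul-15", "opn": 35.1, "hi": 35.55, "lo": 35.0, "close": 35.45, "vol": 2000},
]
add_records(conn, records)
count = conn.execute("SELECT COUNT(*) FROM price_history").fetchone()[0]
```

The SQL strings still have to be maintained by hand, so the point about class attributes being easier to manage stands either way.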
As for your foreign key inquiry, of course. SQLAlchemy has the power to do this too. Here's an example of how a class attribute would look with a foreign key assignment (assuming the ForeignKey class has also been imported from the sqlalchemy module):
    class asset_analysis(base):
        # Tell SQLAlchemy the table name and any table-specific arguments it should know about
        __tablename__ = 'asset_analysis'
        __table_args__ = {'sqlite_autoincrement': True}
        # Tell SQLAlchemy the name of each column and its attributes:
        id = Column(Integer, primary_key=True, nullable=False)
        fid = Column(Integer, ForeignKey('price_history.id'))
This makes the "fid" column a foreign key that points to price_history's id column.
Hope that helps!
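To see how this could fit the "several CSV files that reference each other" case, here is a small self-contained sketch. It assumes SQLAlchemy 1.4 or newer (for declarative_base in sqlalchemy.orm) and Python 3, re-declares both tables in simplified form, and uses made-up rows in place of the two CSV files. The PriceHistory and AssetAnalysis class names, the score column and the relationship attributes are hypothetical additions for illustration.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class PriceHistory(Base):
    __tablename__ = "price_history"
    id = Column(Integer, primary_key=True)
    close = Column(Float)
    # Lets a price row list the analysis rows that point at it
    analyses = relationship("AssetAnalysis", back_populates="price")


class AssetAnalysis(Base):
    __tablename__ = "asset_analysis"
    id = Column(Integer, primary_key=True)
    fid = Column(Integer, ForeignKey("price_history.id"))
    score = Column(Float)  # hypothetical column, for illustration only
    price = relationship("PriceHistory", back_populates="analyses")


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Stand-ins for the parsed rows of two CSV files; the child rows
# carry the parent's id in their "fid" field.
price_rows = [{"id": 1, "close": 35.6}, {"id": 2, "close": 35.45}]
analysis_rows = [
    {"fid": 1, "score": 0.5},
    {"fid": 2, "score": 0.7},
    {"fid": 2, "score": 0.9},
]

session.add_all([PriceHistory(**row) for row in price_rows])
session.add_all([AssetAnalysis(**row) for row in analysis_rows])
session.commit()

# Follow the foreign key from a parent row to its children
second = session.query(PriceHistory).filter_by(id=2).one()
scores = sorted(a.score for a in second.analyses)
session.close()
```

Note that SQLite does not enforce foreign keys unless PRAGMA foreign_keys is turned on for the connection, so a child row with a bad fid would be accepted silently in this sketch.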