python - Serialization optimization using Marshmallow, other solutions
This seems like it should be straightforward, alas:
I have the following SQLAlchemy query object:
all = db.session.query(label('sid', distinct(clinical.patient_sid))).all()
with the desired serialized output [{'sid': 1}, {'sid': 2},...]
To do this, I am trying to use the following simple Marshmallow schema:
from marshmallow import Schema, fields
class TestSchema(Schema):
    sid = fields.Int()
However, when I run
from pprint import pprint
schema = TestSchema()
result = schema.dump(record)
print(result)
pprint(result.data)
I get:
MarshalResult(data={}, errors={})
{}
for output.
However, when I select just one row from the query, e.g.,
one_record = db.session.query(label('sid', distinct(clinical.patient_sid))).first()
I get the desired results:
MarshalResult(data={u'sid': 1}, errors={})
{u'sid': 1}
I know the query with .all() is returning data, since when I print it I get a list of tuples:
[(1L,), (2L,), (3L,), ...]
I am assuming Marshmallow can handle a list of tuples, since the documentation in marshaling.py, under the serialize method, says: "Takes raw data (a dict, list, or other object) and a dict of..." However, it may be an incorrect assumption to think that lists of tuples can be classified as either "lists" or "other objects."
I like Marshmallow otherwise, and was hoping to use it as an optimization over serializing the SQLAlchemy output with an iterative method, like:
all = db.session.query(label('sid', distinct(clinical.patient_sid)))
out = []
for result in all:
    data = {'sid': result.sid}
    out.append(data)
which, for large record sets, can take a while to process.
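The loop above can also be written as a list comprehension. This is only a more compact form of the same row-by-row iteration, so it should not be expected to change the overall cost much for large record sets. The sketch below uses a hypothetical namedtuple Row in place of real SQLAlchemy result rows so that it runs without a database.

```python
from collections import namedtuple

# Hypothetical stand-in for SQLAlchemy result rows, which expose the
# labelled column as the attribute "sid".
Row = namedtuple("Row", ["sid"])
rows = [Row(1), Row(2), Row(3)]

# Same result as the explicit loop, built in one expression.
out = [{"sid": row.sid} for row in rows]
print(out)
```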
Edit
Even if Marshmallow were able to serialize the entire record set output by SQLAlchemy, I am not sure I would get an increase in speed, since it looks like it also iterates over the data.
Any suggestions for optimized serialization of SQLAlchemy output, short of modifying the class definition for clinical?
The solution to optimize my code was to go directly from the SQLAlchemy query object to a pandas data frame (I forgot to mention that I am doing some heavy lifting in pandas after I get the queried record set).
I was able to skip this step
out = []
for result in all:
    data = {'sid': result.sid}
    out.append(data)
by using the read_sql method of pandas as follows:
import pandas as pd
# all is the query object itself here, not the list returned by .all()
pd.read_sql(all.statement, all.session.bind)
and then doing all my data manipulations and gyrations there, thereby shaving off several seconds of processing time.
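A self-contained sketch of the same idea follows. To keep it runnable without a configured SQLAlchemy session, it assumes an in-memory SQLite database opened with the standard sqlite3 module and passes a plain SQL string to read_sql; the table contents are made up for illustration. With SQLAlchemy, the query's statement and the session's bind would take the place of the SQL string and the connection.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for the clinical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clinical (patient_sid INTEGER)")
conn.executemany(
    "INSERT INTO clinical (patient_sid) VALUES (?)",
    [(1,), (2,), (2,), (3,)],
)

# Load the distinct ids straight into a DataFrame, with no Python-level loop.
df = pd.read_sql(
    "SELECT DISTINCT patient_sid AS sid FROM clinical ORDER BY sid", conn
)

# If a list of dicts is still needed, pandas can produce it directly.
records = df.to_dict(orient="records")
print(records)
conn.close()
```

If the downstream work happens in pandas anyway, the to_dict step can be dropped and the DataFrame used as is.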