Python-SQLALchemy

服务器

浏览数:243

2019-3-22

官方文档

Initialization

# 检查是否已经安装以及版本号
>>> import sqlalchemy
>>> sqlalchemy.__version__ 
’1.1.4‘
>>> from sqlalchemy.ext.declarative import declarative_base
# model都是要继承自Base
>>> Base = declarative_base()

>>> from sqlalchemy import Column, Integer, String
>>> class User(Base):
...     __tablename__ = 'users' # 指定数据表名
...
...     id = Column(Integer, primary_key=True)
...     name = Column(String(50))
...     fullname = Column(String(50))
...     password = Column(String(50))
...
...     def __repr__(self):
...        return "<User(name='%s', fullname='%s', password='%s')>" % (
...                             self.name, self.fullname, self.password)


# 查看创建的数据表结构
>>> User.__table__ 
Table('users', MetaData(bind=None),
            Column('id', Integer(), table=<users>, primary_key=True, nullable=False),
            Column('name', String(length=50), table=<users>),
            Column('fullname', String(length=50), table=<users>),
            Column('password', String(length=50), table=<users>), schema=None)

正式创建数据表

>>> from sqlalchemy import create_engine

# 连接到mysql
>>> engine = create_engine("mysql://root:root@localhost:3306/python?charset=utf8",
                           encoding="utf-8", echo=True)

# 正式创建数据表
>>> Base.metadata.create_all(engine)
CREATE TABLE users (
    id INTEGER NOT NULL AUTO_INCREMENT, 
    name VARCHAR(50), 
    fullname VARCHAR(50), 
    password VARCHAR(50), 
    PRIMARY KEY (id)
)

Creating a Session

下面的操作都是要通过会话对象操作

>>> from sqlalchemy.orm import sessionmaker
>>> Session = sessionmaker(bind=engine)
>>> session = Session()

Adding and Updating Objects

添加一个User对象

>>> ed_user = User(name='ed', fullname='Ed Jones', password='edspassword')
>>> session.add(ed_user)

查询一下,使用filter_by来过滤,first只列出第一个查询到的对象

>>> our_user = session.query(User).filter_by(name='ed').first()
BEGIN (implicit)
INSERT INTO users (name, fullname, password) VALUES (?, ?, ?)
('ed', 'Ed Jones', 'edspassword')

SELECT users.id AS users_id,
        users.name AS users_name,
        users.fullname AS users_fullname,
        users.password AS users_password
FROM users
WHERE users.name = ?
 LIMIT ? OFFSET ?
('ed', 1, 0)

>>> our_user
<User(name='ed', fullname='Ed Jones', password='edspassword')>

>>> ed_user is our_user
True

使用add_all,一次性添加多个对象

>>> session.add_all([
...     User(name='wendy', fullname='Wendy Williams', password='foobar'),
...     User(name='mary', fullname='Mary Contrary', password='xxg527'),
...     User(name='fred', fullname='Fred Flinstone', password='blah')])

Session很智能,比如说,它知道Ed Jones被修改了

# 可以直接修改ed_user对象
>>> ed_user.password = 'f8s7ccs'

# session会自动知道哪些数据被修改了
>>> session.dirty
IdentitySet([<User(name='ed', fullname='Ed Jones', password='f8s7ccs')>])

# session也可以知道哪些对象被新建了
>>> session.new
IdentitySet([<User(name='wendy', fullname='Wendy Williams', password='foobar')>,
<User(name='mary', fullname='Mary Contrary', password='xxg527')>,
<User(name='fred', fullname='Fred Flinstone', password='blah')>])

对数据库进行了变更,自然要进行commit,从echo语句我们可以看出,我们更新了1个对象,创建了3个对象。

>>> session.commit()
UPDATE users SET password=? WHERE users.id = ?
('f8s7ccs', 1)
INSERT INTO users (name, fullname, password) VALUES (?, ?, ?)
('wendy', 'Wendy Williams', 'foobar')
INSERT INTO users (name, fullname, password) VALUES (?, ?, ?)
('mary', 'Mary Contrary', 'xxg527')
INSERT INTO users (name, fullname, password) VALUES (?, ?, ?)
('fred', 'Fred Flinstone', 'blah')
COMMIT

>>> ed_user.id
BEGIN (implicit)
SELECT users.id AS users_id,
        users.name AS users_name,
        users.fullname AS users_fullname,
        users.password AS users_password
FROM users
WHERE users.id = ?
(1,)
1

Rolling Back

因为Session是工作在一个transaction内部,有时候我们可能不小心做了一些误删除的操作,可以回滚。我们先修改ed_user的用户名为Edwardo,然后重新添加一个新User,但是记住这个时候我们还没有commit

>>> ed_user.name = 'Edwardo'
and we’ll add another erroneous user, fake_user:

>>> fake_user = User(name='fakeuser', fullname='Invalid', password='12345')
>>> session.add(fake_user)
Querying the session, we can see that they’re flushed into the current transaction:

查询检验一下

>>> session.query(User).filter(User.name.in_(['Edwardo', 'fakeuser'])).all()
UPDATE users SET name=? WHERE users.id = ?
('Edwardo', 1)
INSERT INTO users (name, fullname, password) VALUES (?, ?, ?)
('fakeuser', 'Invalid', '12345')
SELECT users.id AS users_id,
        users.name AS users_name,
        users.fullname AS users_fullname,
        users.password AS users_password
FROM users
WHERE users.name IN (?, ?)
('Edwardo', 'fakeuser')
[<User(name='Edwardo', fullname='Ed Jones', password='f8s7ccs')>, <User(name='fakeuser', fullname='Invalid', password='12345')>]

回滚,我们可以知道ed_user‘s name is back to ed以及fake_user has been kicked out of the session

>>> session.rollback()
ROLLBACK

>>> ed_user.name
BEGIN (implicit)
SELECT users.id AS users_id,
        users.name AS users_name,
        users.fullname AS users_fullname,
        users.password AS users_password
FROM users
WHERE users.id = ?
(1,)
u'ed'

>>> fake_user in session
False
issuing a SELECT illustrates the changes made to the database:

这个时候再查询,很明显fakeuser已经消失了,ed用户的名字重新变回了ed而不是Edwordo

>>> session.query(User).filter(User.name.in_(['ed', 'fakeuser'])).all()
SELECT users.id AS users_id,
        users.name AS users_name,
        users.fullname AS users_fullname,
        users.password AS users_password
FROM users
WHERE users.name IN (?, ?)
('ed', 'fakeuser')
[<User(name='ed', fullname='Ed Jones', password='f8s7ccs')>]

Couting

用于查询操作相对应的count()操作

>>> session.query(User).filter(User.name.like('%ed')).count()
2

>>> from sqlalchemy import func
>>> session.query(func.count(User.name), User.name).group_by(User.name).all()
[(1, u'ed'), (1, u'fred'), (1, u'mary'), (1, u'wendy')]

Querying

一个通过在Session上使用query方法可以创建一个Query object

按照用户id进行排序来进行查询

>>> for instance in session.query(User).order_by(User.id):
...     print(instance.name, instance.fullname)
ed Ed Jones
wendy Wendy Williams
mary Mary Contrary
fred Fred Flinstone

query方法也可以接收ORM-instrumented descriptors作为参数。返回结果是一个named tuples

>>> for name, fullname in session.query(User.name, User.fullname):
...     print(name, fullname)
ed Ed Jones
wendy Wendy Williams
mary Mary Contrary
fred Fred Flinstone

The tuples returned by Query are named tuples, supplied by the KeyedTuple class, and can be treated much like an ordinary Python object. The names are the same as the attribute’s name for an attribute, and the class name for a class:

>>> for row in session.query(User, User.name).all():
...    print(row.User, row.name)
<User(name='ed', fullname='Ed Jones', password='f8s7ccs')> ed
<User(name='wendy', fullname='Wendy Williams', password='foobar')> wendy
<User(name='mary', fullname='Mary Contrary', password='xxg527')> mary
<User(name='fred', fullname='Fred Flinstone', password='blah')> fred

You can control the names of individual column expressions using the label() construct, which is available from any ColumnElement-derived object, as well as any class attribute which is mapped to one (such as User.name):

>>> for row in session.query(User.name.label('name_label')).all():
...    print(row.name_label)
ed
wendy
mary
fred

The name given to a full entity such as User, assuming that multiple entities are present in the call to query(), can be controlled using aliased() :

>>> from sqlalchemy.orm import aliased
>>> user_alias = aliased(User, name='user_alias')

>>> for row in session.query(user_alias, user_alias.name).all():
...    print(row.user_alias)
<User(name='ed', fullname='Ed Jones', password='f8s7ccs')>
<User(name='wendy', fullname='Wendy Williams', password='foobar')>
<User(name='mary', fullname='Mary Contrary', password='xxg527')>
<User(name='fred', fullname='Fred Flinstone', password='blah')>

Basic operations with Query include issuing LIMIT and OFFSET, most conveniently using Python array slices and typically in conjunction with ORDER BY:

>>> for u in session.query(User).order_by(User.id)[1:3]:
...    print(u)
<User(name='wendy', fullname='Wendy Williams', password='foobar')>
<User(name='mary', fullname='Mary Contrary', password='xxg527')>
and filtering results, which is accomplished either with filter_by(), which uses keyword arguments:

>>> for name, in session.query(User.name).\
...             filter_by(fullname='Ed Jones'):
...    print(name)
ed

>>> for name, in session.query(User.name).\
...             filter(User.fullname=='Ed Jones'):
...    print(name)
ed

The Query object is fully generative, meaning that most method calls return a new Query object upon which further criteria may be added. For example, to query for users named “ed” with a full name of “Ed Jones”, you can call filter() twice, which joins criteria using AND:

>>> for user in session.query(User).\
...          filter(User.name=='ed').\
...          filter(User.fullname=='Ed Jones'):
...    print(user)
<User(name='ed', fullname='Ed Jones', password='f8s7ccs')>
Common Filter Operators

下面列出了filter()最常用的一些operators

equals:
query.filter(User.name == 'ed')

not equals:
query.filter(User.name != 'ed')

LIKE:
query.filter(User.name.like('%ed%'))

IN:
query.filter(User.name.in_(['ed', 'wendy', 'jack']))

# works with query objects too:
query.filter(User.name.in_(
        session.query(User.name).filter(User.name.like('%ed%'))
))

NOT IN:
query.filter(User.name.in_(['ed', 'wendy', 'jack']))

IS NULL:
query.filter(User.name == None)

# alternatively, if pep8/linters are a concern
query.filter(User.name.is_(None))

IS NOT NULL:
query.filter(User.name != None)

# alternatively, if pep8/linters are a concern
query.filter(User.name.isnot(None))

AND:
# use and_()
from sqlalchemy import and_
query.filter(and_(User.name == 'ed', User.fullname == 'Ed Jones'))

# or send multiple expressions to .filter()
query.filter(User.name == 'ed', User.fullname == 'Ed Jones')

# or chain multiple filter()/filter_by() calls
query.filter(User.name == 'ed').filter(User.fullname == 'Ed Jones')

Note
Make sure you use and_() and not the Python and operator!

OR:
from sqlalchemy import or_
query.filter(or_(User.name == 'ed', User.name == 'wendy'))

Note
Make sure you use or_() and not the Python or operator!

MATCH:
query.filter(User.name.match('wendy'))

Note
match() uses a database-specific MATCH or CONTAINS function;
its behavior will vary by backend and is not available on some backends such as SQLite.

Building a Relationship

创建对象与对象之间的关系,下面我们新建一个Address表,下面的操作相比django的orm繁琐一些,要同时在两个class内部同时设置relationship

>>> from sqlalchemy import ForeignKey
>>> from sqlalchemy.orm import relationship

>>> class Address(Base):
...     __tablename__ = 'addresses'
...     id = Column(Integer, primary_key=True)
...     email_address = Column(String(50), nullable=False)
...     user_id = Column(Integer, ForeignKey('users.id'))
...
...     user = relationship("User", back_populates="addresses") # 将地址表和用户表关联
...
...     def __repr__(self):
...         return "<Address(email_address='%s')>" % self.email_address


# 在用户表中还要重新设置一次
>>> User.addresses = relationship(
...     "Address", order_by=Address.id, back_populates="user")

>>> Base.metadata.create_all(engine)

Working with Related Objects

现在我们创建了一个User,与它对应的一个空addresses集合也将创立。集合类型可以是各种合法类型,比如set/dictionaries(see Customizing Collection Access for details),但是默认集合是一个list。

现在我们再来创建一个用户Jack

>>> jack = User(name='jack', fullname='Jack Bean', password='gjffdd')
>>> jack.addresses
[]

We are free to add Address objects on our User object. In this case we just assign a full list directly:

现在我们将用户Jack和一些地址关联起来

>>> jack.addresses = [
...                 Address(email_address='jack@google.com'),
...                 Address(email_address='j25@yahoo.com')]

When using a bidirectional relationship, elements added in one direction automatically become visible in the other direction. This behavior occurs based on attribute on-change events and is evaluated in Python, without using any SQL:
现在可以通过地址对象访问用户对象了

>>> jack.addresses[1]
<Address(email_address='j25@yahoo.com')>

>>> jack.addresses[1].user
<User(name='jack', fullname='Jack Bean', password='gjffdd')>

Let’s add and commit Jack Bean to the database. jack as well as the two Address members in the corresponding addresses collection are both added to the session at once, using a process known as cascading:

接下来commit保存到数据库

>>> session.add(jack)
>>> session.commit()
sqlalchemy.engine.base.Engine INSERT INTO addresses (email_address, user_id) VALUES (%s, %s)
sqlalchemy.engine.base.Engine ('jack@google.com', 5L)
sqlalchemy.engine.base.Engine INSERT INTO addresses (email_address, user_id) VALUES (%s, %s)
sqlalchemy.engine.base.Engine ('j25@yahoo.com', 5L)
sqlalchemy.engine.base.Engine COMMIT

Querying for Jack, we get just Jack back. No SQL is yet issued for Jack’s addresses:

>>> jack = session.query(User).\
... filter_by(name='jack').one()
>>> jack
<User(name='jack', fullname='Jack Bean', password='gjffdd')>
Let’s look at the addresses collection. Watch the SQL:

>>> jack.addresses
[<Address(email_address='jack@google.com')>, <Address(email_address='j25@yahoo.com')>]

When we accessed the addresses collection, SQL was suddenly issued. This is an example of a lazy loading relationship. The addresses collection is now loaded and behaves just like an ordinary list. We’ll cover ways to optimize the loading of this collection in a bit.

Delete

删除操作,接下来我们尝试删除jack对象,注意地址对象并不会因此而删除

>>> session.delete(jack)
>>> session.query(User).filter_by(name='jack').count()
0
So far, so good. How about Jack’s Address objects ?

>>> session.query(Address).filter(
...     Address.email_address.in_(['jack@google.com', 'j25@yahoo.com'])
...  ).count()
2

Uh oh, they’re still there ! Analyzing the flush SQL, we can see that the user_id column of each address was set to NULL, but the rows weren’t deleted. SQLAlchemy doesn’t assume that deletes cascade, you have to tell it to do so. Configuring delete/delete-orphan Cascade. We will configure cascade options on the User.addresses relationship to change the behavior. While SQLAlchemy allows you to add new attributes and relationships to mappings at any point in time, in this case the existing relationship needs to be removed, so we need to tear down the mappings completely and start again – we’ll close the Session:

直接close来rollback,并不进行commit

>>> session.close()
ROLLBACK

Use a new declarative_base():

>>> Base = declarative_base()

Next we’ll declare the User class, adding in the addresses relationship
including the cascade configuration (we’ll leave the constructor out too):

>>> class User(Base):
...     __tablename__ = 'users'
...
...     id = Column(Integer, primary_key=True)
...     name = Column(String(50))
...     fullname = Column(String(50))
...     password = Column(String(50))
...
...     addresses = relationship("Address", back_populates='user',
...                     cascade="all, delete, delete-orphan")
...
...     def __repr__(self):
...        return "<User(name='%s', fullname='%s', password='%s')>" % (
...                                self.name, self.fullname, self.password)

Then we recreate Address, noting that in this case
we’ve created the Address.user relationship via the User class already:

>>> class Address(Base):
...     __tablename__ = 'addresses'
...     id = Column(Integer, primary_key=True)
...     email_address = Column(String(50), nullable=False)
...     user_id = Column(Integer, ForeignKey('users.id'))
...     user = relationship("User", back_populates="addresses")
...
...     def __repr__(self):
...         return "<Address(email_address='%s')>" % self.email_address

Now when we load the user jack (below using get(), which loads by primary key), removing an address from the corresponding addresses collection will result in that Address being deleted:

# load Jack by primary key
>>> jack = session.query(User).get(5)

# remove one Address (lazy load fires off)
>>> del jack.addresses[1]

# only one address remains
>>> session.query(Address).filter(
...     Address.email_address.in_(['jack@google.com', 'j25@yahoo.com'])
... ).count()
1

Deleting Jack will delete both Jack and the remaining Address associated with the user:

>>> session.delete(jack)

>>> session.query(User).filter_by(name='jack').count()
0

>>> session.query(Address).filter(
...    Address.email_address.in_(['jack@google.com', 'j25@yahoo.com'])
... ).count()
0

Further detail on configuration of cascades is at Cascades. The cascade functionality can also integrate smoothly with the ON DELETE CASCADE functionality of the relational database. See Using Passive Deletes for details.

backref

上面同时设置两个relationship太麻烦了,可以使用backref

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    addresses = relationship("Address", backref="user")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    email = Column(String)
    user_id = Column(Integer, ForeignKey('user.id'))

The above configuration establishes a collection of Address objects on User called User.addresses. It also establishes a .user attribute on Address which will refer to the parent User object.

In fact, the backref keyword is only a common shortcut for placing a second relationship() onto the Address mapping, including the establishment of an event listener on both sides which will mirror attribute operations in both directions. The above configuration is equivalent to:

rom sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    addresses = relationship("Address", back_populates="user")
        
class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    email = Column(String)
    user_id = Column(Integer, ForeignKey('user.id'))

    user = relationship("User", back_populates="addresses")

Above, we add a .user relationship to Address explicitly. On both relationships, the back_populates directive tells each relationship about the other one, indicating that they should establish “bidirectional” behavior between each other. The primary effect of this configuration is that the relationship adds event handlers to both attributes which have the behavior of “when an append or set event occurs here, set ourselves onto the incoming attribute using this particular attribute name”. The behavior is illustrated as follows. Start with a User and an Address instance. The .addresses collection is empty, and the .user attribute is None:

>>> u1 = User()
>>> a1 = Address()
>>> u1.addresses
[]
>>> print(a1.user)
None

However, once the Address is appended to the u1.addresses collection, both the collection and the scalar attribute have been populated:

>>> u1.addresses.append(a1)
>>> u1.addresses
[<__main__.Address object at 0x12a6ed0>]
>>> a1.user
<__main__.User object at 0x12a6590>

This behavior of course works in reverse for removal operations as well, as well as for equivalent operations on both sides. Such as when .user is set again to None, the Address object is removed from the reverse collection:

>>> a1.user = None
>>> u1.addresses
[]

The manipulation of the .addresses collection and the .user attribute occurs entirely in Python without any interaction with the SQL database. Without this behavior, the proper state would be apparent on both sides once the data has been flushed to the database, and later reloaded after a commit or expiration operation occurs. The backref/back_populates behavior has the advantage that common bidirectional operations can reflect the correct state without requiring a database round trip.

Remember, when the backref keyword is used on a single relationship, it’s exactly the same as if the above two relationships were created individually using back_populates on each.

mysql操作

检验一下我们上面的成果以及熟悉创建的mysql表的结构

地址表的结构

> SHOW CREATE TABLE addresses;
+-----------+----------------+
| Table     | Create Table   |
|-----------+----------------|
| addresses | CREATE TABLE `addresses` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `email_address` varchar(50) NOT NULL,
  `user_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`),
  CONSTRAINT `addresses_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8                |
+-----------+----------------+
1 row in set
Time: 0.005s

> DESC addresses;
+---------------+-------------+--------+-------+-----------+----------------+
| Field         | Type        | Null   | Key   |   Default | Extra          |
|---------------+-------------+--------+-------+-----------+----------------|
| id            | int(11)     | NO     | PRI   |    <null> | auto_increment |
| email_address | varchar(50) | NO     |       |    <null> |                |
| user_id       | int(11)     | YES    | MUL   |    <null> |                |
+---------------+-------------+--------+-------+-----------+----------------+
3 rows in set
Time: 0.002s

用户表的结构

> SHOW CREATE TABLE users;
+---------+----------------+
| Table   | Create Table   |
|---------+----------------|
| users   | CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  `fullname` varchar(50) DEFAULT NULL,
  `password` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8                |
+---------+----------------+
1 row in set
Time: 0.002s

> DESC users;
+----------+-------------+--------+-------+-----------+----------------+
| Field    | Type        | Null   | Key   |   Default | Extra          |
|----------+-------------+--------+-------+-----------+----------------|
| id       | int(11)     | NO     | PRI   |    <null> | auto_increment |
| name     | varchar(50) | YES    |       |    <null> |                |
| fullname | varchar(50) | YES    |       |    <null> |                |
| password | varchar(50) | YES    |       |    <null> |                |
+----------+-------------+--------+-------+-----------+----------------+
4 rows in set
Time: 0.003s

详细数据

> SELECT * FROM addresses;
+------+-----------------+-----------+
|   id | email_address   |   user_id |
|------+-----------------+-----------|
|    3 | jack@google.com |         5 |
|    4 | j25@yahoo.com   |         5 |
+------+-----------------+-----------+
2 rows in set
Time: 0.002s

> SELECT * FROM users;
+------+--------+----------------+------------+
|   id | name   | fullname       | password   |
|------+--------+----------------+------------|
|    1 | ed     | Ed Jones       | f8s7ccs    |
|    2 | wendy  | Wendy Williams | foobar     |
|    3 | mary   | Mary Contrary  | xxg527     |
|    4 | fred   | Fred Flinstone | blah       |
|    5 | jack   | Jack Bean      | gjffdd     |
+------+--------+----------------+------------+
5 rows in set
Time: 0.003s

知乎live设计模型

from sqlalchemy import Column, String, Integer, create_engine, SmallInteger
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

DB_URI = 'sqlite:///user.db'
Base = declarative_base()
engine = create_engine(DB_URI)
Base.metadata.bind = engine
Session = sessionmaker(bind=engine)
session = Session()
class User(Base):
    __tablename__ = 'live_user'
    
    id = Column(Integer, unique=True, primary_key=True, autoincrement=True)
    speaker_id = Column(String(40), index=True, unique=True)
    name = Column(String(40), index=True, nullable=False)
    gender = Column(SmallInteger, default=2)
    headline = Column(String(200))
    avatar_url = Column(String(100), nullable=False)
    bio = Column(String(200))
    description = Column(String())
    
    @classmethod
    def add(cls, **kwargs):
        speaker_id = kwargs.get('speaker_id', None)
        if id is not None:
            r = session.query(cls).filter_by(speaker_id=speaker_id).first()
            if r:
                return r
        try:
            r = cls(**kwargs)
            session.add(r)
            session.commit()
        except:
            session.rollback()
            raise
        else:
            return r
Base.metadata.create_all()

接口分为2种:

  1. https://api.zhihu.com/lives/o… (未结束)

  2. https://api.zhihu.com/lives/e… (已结束)

elasticsearch-dsl-py相比elasticsearch-py做了各种封装,DSL也支持用类代表一个doc_type(类似数据库中的Table),实现ORM的效果。我们就用它来写Live模型:

from elasticsearch_dsl import DocType, Date, Integer, Text, Float, Boolean
from elasticsearch_dsl.connections import connections
from elasticsearch_dsl.query import SF, Q
from config import SEARCH_FIELDS
from .speaker import User, session

connections.create_connection(hosts=['localhost'])
class Live(DocType):
    id = Integer()
    speaker_id = Integer()
    feedback_score = Float() # 评分
    topic_names = Text(analyzer='ik_max_word')  # 话题标签名字
    seats_taken = Integer()  # 参与人数
    subject = Text(analyzer='ik_max_word')  # 标题
    amount = Float()  # 价格(RMB)
    description = Text(analyzer='ik_max_word')
    status = Boolean()  # public(True)/ended(False)
    starts_at = Date()
    outline = Text(analyzer='ik_max_word')  # Live内容
    speaker_message_count = Integer()
    tag_names = Text(analyzer='ik_max_word')
    liked_num = Integer()
    
    class Meta:
        index = 'live'
        
    @classmethod
    def add(cls, **kwargs):
        id = kwargs.pop('id', None)
        if id is None:
            return False
        live = cls(meta={'id': id}, **kwargs)
        live.save()
        return live

它允许我们用一种非常可维护的方法来组织字典:

In : from elasticsearch_dsl.query import Q
In : Q('multi_match', subject='python').to_dict()
Out: {'multi_match': {'subject': 'python'}}
In : from elasticsearch import Elasticsearch
In : from elasticsearch_dsl import Search, Q
In : s = Search(using=client, index='live')
In : s = s.query('match', subject='python').query(~Q('match', description='量化'))
In : s.execute()
Out: <Response: [<Hit(live/live/789840559912009728): {'subject': 'Python 工程师的入门和进阶', 'feedback_score': 4.5, 'stat...}>]>

上述例子表示从live这个索引(类似数据库中的Database)中找到subject字典包含python,但是description字段不包含量化的Live。