Overview
之前项目的人遗留的数据散落在多种数据库中,既有MySQL
,MongoDB
,也有Cassandra
和PostgreSQL
。在Python3
版本的jupyter
中连接PostgreSQL
需要安装psycopg2
,而psycopg2
在Python2
中则是已经集成好的。
1. 安装python3-psycopg2
和libpq-dev
先在Linux上安装好这两个包,
1 2 | sudo apt-get install python3-psycopg2 sudo apt-get install libpq-dev |
再安装psycopg2
,
1 | pip3 install psycopg2 |
否则就会报错如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | ERROR: Command errored out with exit status 1: command : /home/yangbingjiao/anaconda3/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = ' "'" ' /tmp/pip-install-wxj7s4lx/psycopg2/setup .py '"' " '; __file__=' "'" ' /tmp/pip-install-wxj7s4lx/psycopg2/setup .py '"' " ';f=getattr(tokenize, ' "'" ' open '"' " ', open)(__file__);code=f.read().replace(' "'" '\r\n '"' " ', ' "'" '\n '"' " ');f.close();exec(compile(code, __file__, ' "'" ' exec '"' " '))' egg_info --egg-base pip-egg-info cwd: /tmp/pip-install-wxj7s4lx/psycopg2/ Complete output (23 lines): running egg_info creating pip-egg-info /psycopg2 .egg-info writing pip-egg-info /psycopg2 .egg-info /PKG-INFO writing dependency_links to pip-egg-info /psycopg2 .egg-info /dependency_links .txt writing top -level names to pip-egg-info /psycopg2 .egg-info /top_level .txt writing manifest file 'pip-egg-info/psycopg2.egg-info/SOURCES.txt' Error: pg_config executable not found. pg_config is required to build psycopg2 from source . Please add the directory containing pg_config to the $PATH or specify the full executable path with the option: python setup.py build_ext --pg-config /path/to/pg_config build ... or with the pg_config option in 'setup.cfg' . If you prefer to avoid building psycopg2 from source , please install the PyPI 'psycopg2-binary' package instead. For further information please check the 'doc/src/install.rst' file (also at <http: //initd .org /psycopg/docs/install .html>). ---------------------------------------- ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output. |
2.连接PostgreSQL
数据库
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | import psycopg2 import pandas as pd conn = psycopg2.connect(database = "database" , user = "user" , password = "password" , host = "192.168.1.230" , port = "5432" ) cur = conn.cursor() pql = """ SELECT * FROM database.table WHERE create_time >= '2019-09-23' AND create_time < '2019-09-30' order by create_time limit 10 """ cur.execute(pql) rows = cur.fetchall() df = pd.DataFrame(rows) conn.close() |
本文主要参考了以下文章:
Python3连接PostgreSQL(10.5)数据库