<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Graph Databases — Programming in Python 7.0 documentation</title>
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<!--[if lt IE 9]>
<script src="../_static/js/html5shiv.min.js"></script>
<![endif]-->
<script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js"></script>
<script src="../_static/jquery.js"></script>
<script src="../_static/underscore.js"></script>
<script src="../_static/doctools.js"></script>
<script src="../_static/js/theme.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="next" title="Concurrent Programming" href="Concurrency.html" />
<link rel="prev" title="No SQL Databases" href="NoSQL.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" style="background: #4b2e83" >
<a href="../index.html">
<img src="../_static/UWPCE_logo_full.png" class="logo" alt="Logo"/>
</a>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<p class="caption" role="heading"><span class="caption-text">Topics in the Program</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../topics/01-setting_up/index.html">1. Setting up your Environment</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/02-basic_python/index.html">2. Basic Python</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/03-recursion_booleans/index.html">3. Booleans and Recursion</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/04-sequences_iteration/index.html">4. Sequences and Iteration</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/05-text_handling/index.html">5. Basic Text Handling</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/06-exceptions/index.html">6. Exception Handling</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/07-unit_testing/index.html">7. Unit Testing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/08-dicts_sets/index.html">8. Dictionaries and Sets</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/09-files/index.html">9. File Handling</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/10-modules_packages/index.html">10. Modules and Packages</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/11-argument_passing/index.html">11. Advanced Argument Passing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/12-comprehensions/index.html">12. Comprehensions</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/13-intro_oo/index.html">13. Intro to Object Oriented Programing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/14-magic_methods/index.html">14. Properties and Magic Methods</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/15-subclassing/index.html">15. Subclassing and Inheritance</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/16-multiple_inheritance/index.html">16. Multiple Inheritance</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/17-functional_programming/index.html">17. Introduction to Functional Programming</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/18-advanced_testing/index.html">18. Advanced Testing</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="../topics/99-extras/index.html">19. Extra Topics</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="Pep8.html">Coding Style and Linting</a></li>
<li class="toctree-l2"><a class="reference internal" href="CodeReviews.html">Code Reviews</a></li>
<li class="toctree-l2"><a class="reference internal" href="PersistanceAndSerialization.html">Persistence and Serialization</a></li>
<li class="toctree-l2"><a class="reference internal" href="Unicode.html">Unicode in Python</a></li>
<li class="toctree-l2"><a class="reference internal" href="IteratorsAndGenerators.html">Iterators and Generators</a></li>
<li class="toctree-l2"><a class="reference internal" href="Decorators.html">Decorators</a></li>
<li class="toctree-l2"><a class="reference internal" href="../exercises/mailroom/mailroom-decorator.html">Mailroom – Decoratoring it</a></li>
<li class="toctree-l2"><a class="reference internal" href="ContextManagers.html">Context Managers</a></li>
<li class="toctree-l2"><a class="reference internal" href="../exercises/context-managers-exercise.html">A Couple Handy Context Managers</a></li>
<li class="toctree-l2"><a class="reference internal" href="MetaProgramming.html">Metaprogramming</a></li>
<li class="toctree-l2"><a class="reference internal" href="../exercises/mailroom/mailroom-meta.html">Mailroom – metaprogramming it!</a></li>
<li class="toctree-l2"><a class="reference internal" href="Logging.html">Logging and the logging module</a></li>
<li class="toctree-l2"><a class="reference internal" href="Debugging.html">Debugging</a></li>
<li class="toctree-l2"><a class="reference internal" href="NoSQL.html">No SQL Databases</a></li>
<li class="toctree-l2 current"><a class="current reference internal" href="#">Graph Databases</a></li>
<li class="toctree-l2"><a class="reference internal" href="Concurrency.html">Concurrent Programming</a></li>
<li class="toctree-l2"><a class="reference internal" href="Async.html">Asychronous Programming</a></li>
<li class="toctree-l2"><a class="reference internal" href="Coroutines.html">Notes on Coroutines</a></li>
<li class="toctree-l2"><a class="reference internal" href="ThreadingMultiprocessing.html">Threading and multiprocessing</a></li>
<li class="toctree-l2"><a class="reference internal" href="../exercises/threaded_downloader.html">Threaded Web Scraper</a></li>
<li class="toctree-l2"><a class="reference internal" href="Profiling.html">Performance and Profiling</a></li>
</ul>
</li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" style="background: #4b2e83" >
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../index.html">Programming in Python</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content style-external-links">
<div role="navigation" aria-label="Page navigation">
<ul class="wy-breadcrumbs">
<li><a href="../index.html" class="icon icon-home"></a> »</li>
<li><a href="../topics/99-extras/index.html"><span class="section-number">19. </span>Extra Topics</a> »</li>
<li>Graph Databases</li>
<li class="wy-breadcrumbs-aside">
<a href="../_sources/modules/GraphDatabases.rst.txt" rel="nofollow"> View page source</a>
</li>
</ul><div class="rst-breadcrumbs-buttons" role="navigation" aria-label="Sequential page navigation">
<a href="NoSQL.html" class="btn btn-neutral float-left" title="No SQL Databases" accesskey="p"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="Concurrency.html" class="btn btn-neutral float-right" title="Concurrent Programming" accesskey="n">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="graph-databases">
<span id="id1"></span><h1>Graph Databases<a class="headerlink" href="#graph-databases" title="Permalink to this headline"></a></h1>
<p>In computing, a graph database is a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data.</p>
<p><a class="reference external" href="https://en.wikipedia.org/wiki/Graph_database">https://en.wikipedia.org/wiki/Graph_database</a></p>
<p>For those of you who aren’t familiar with the mathematical concept of a “graph” – what all this means is that the database itself stores not just the data, but also the relationships between the data.</p>
<p>This is in contrast to RDBMSs, where the data are stored in individual tables, with the relationships between the tables maintained via primary and foreign keys, and the actual relationships determined on the fly by searching multiple tables during “join” queries. RDBMSs are well optimized for these kinds of queries, but graph databases can be much more efficient for data retrieval when the records have complex relationships.</p>
<p>I find it a bit ironic that “relational” databases aren’t directly storing the relationships :-)</p>
<p>The Wikipedia page has a pretty good description and example of how that works.</p>
<p>There are a number of commercial and open-source graph databases out there, and more than a few have Python drivers.</p>
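<p>As a rough illustration of the difference, here is a plain-Python sketch. It is not how any real database engine is implemented, and all of the names and data are made up. The first lookup scans a table of key pairs, the way a naive join would, while the second reads relationships that are stored directly with each record:</p>

```python
# Hypothetical data: people and "follows" relationships.

# Relational style: rows in separate "tables", linked by keys.
people = {1: "Ann", 2: "Bob", 3: "Cat"}
follows = [(1, 2), (1, 3), (2, 3)]  # (follower_id, followed_id)


def followed_by_join(person_id):
    """Find who a person follows by scanning the whole link table."""
    return [people[dst] for src, dst in follows if src == person_id]


# Graph style: each node stores its own outgoing relationships.
graph = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Cat"],
    "Cat": [],
}


def followed_by_traversal(name):
    """Find who a person follows by reading the node's own edges."""
    return graph[name]


print(followed_by_join(1))           # ['Bob', 'Cat']
print(followed_by_traversal("Ann"))  # ['Bob', 'Cat']
```

<p>Both calls give the same answer, but the second never has to search through unrelated rows, which is the property graph databases exploit when relationships get deep or numerous.</p>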
<div class="section" id="neo4j">
<h2>neo4j<a class="headerlink" href="#neo4j" title="Permalink to this headline"></a></h2>
<p><a class="reference external" href="https://neo4j.com/">neo4j</a> is perhaps the most <a class="reference external" href="https://db-engines.com/en/ranking/graph+dbms">popular</a> graph database as of this writing, and it comes with a Python driver and good documentation, so we’ll use that one for examples.</p>
<p>Here is a nice Python-based tutorial about graph databases and neo4j:</p>
<p><a class="reference external" href="https://medium.com/labcodes/graph-databases-talking-about-your-data-relationships-with-python-b438c689dc89">Talking About your Data Relationships</a></p>
<p>And <a class="reference external" href="https://neo4j.com/developer/python/">here are the docs</a> for the Python driver.</p>
<p>And the Python API documentation: <a class="reference external" href="https://neo4j.com/docs/api/python-driver/current/">python-driver API</a></p>
<p>There are a lot of other great docs and tutorials on the neo4j web site – well worth checking out if you want to really learn how to use it.</p>
<p>And here is the “official” <a class="reference download internal" download="" href="../_downloads/1cf25777c3ecfb8cba104487ed28d1d1/neo4j-developer-manual-3.3-python.pdf"><code class="xref download docutils literal notranslate"><span class="pre">neo4j</span> <span class="pre">developer</span> <span class="pre">manual:</span> <span class="pre">Python</span></code></a></p>
</div>
<div class="section" id="neo4j-example">
<h2>neo4j example<a class="headerlink" href="#neo4j-example" title="Permalink to this headline"></a></h2>
<div class="section" id="setup">
<h3>Setup<a class="headerlink" href="#setup" title="Permalink to this headline"></a></h3>
<p>When we use databases, we introduce additional setup that we must perform before we can create and access our data. These setup activities can get really complex, and in reality, when developing software professionally, we may find that the setup is performed by someone other than a developer – that’s what sys admins are for :-).</p>
<p>To let us focus on Python development, we are going to use the simplest possible way to get a database running, one that works whether you use Linux, macOS, or Windows.</p>
<p>Here, we’ll talk you through the steps to get Neo4j working.</p>
<p>What are we going to do?</p>
<ul class="simple">
<li><p>First we’ll sign up for a free, online Neo4j account.</p></li>
<li><p>Then, we’ll configure the online database so we can start developing.</p></li>
<li><p>Next, we’ll make sure we have secure access to our database.</p></li>
<li><p>Finally, we’ll install the requisite Python modules.</p></li>
</ul>
<p>At that point, Python development can commence!</p>
<p>Let’s get started.</p>
<div class="section" id="graphendb">
<h4>GrapheneDB<a class="headerlink" href="#graphendb" title="Permalink to this headline"></a></h4>
<p><a class="reference external" href="https://www.graphenedb.com/">GrapheneDB</a> is a hosting service for neo4j databases. They provide free “sandbox” accounts for small databases that you can use to test and learn neo4j.</p>
<p>The getting started guide is here: <a class="reference external" href="https://docs.graphenedb.com/docs/getting-started">Getting Started</a></p>
</div>
<div class="section" id="getting-an-account">
<h4>Getting an account:<a class="headerlink" href="#getting-an-account" title="Permalink to this headline"></a></h4>
<ol class="arabic simple">
<li><p>Go to <a class="reference external" href="https://www.graphenedb.com/">https://www.graphenedb.com/</a></p></li>
<li><p>Click on the “Sign up” button in the upper right to get signed up for an account.</p></li>
<li><p>Once you have created an account, you need to create a database. You can create a small (but not that small!) “Hobby” database for free.</p></li>
<li><p>Once you create the database, the service will create a username (the name you gave the database) and a generated password. Be sure to record your username and password.</p></li>
</ol>
<p>Note that when your database is set up, you also get connection strings for both the “bolt” and HTTP REST interfaces. Originally designed for neo4j, Bolt is a highly efficient, lightweight client-server protocol designed for database applications.</p>
<p><a class="reference external" href="https://boltprotocol.org/">https://boltprotocol.org/</a></p>
</div>
<div class="section" id="managing-your-password">
<h4>Managing your password:<a class="headerlink" href="#managing-your-password" title="Permalink to this headline"></a></h4>
<p>We always have to sign in to our network database using our user name and password. That means these credentials must be accessible to our Python program. But we must make sure that our password stays secure. If we check code containing the password in to GitHub, anyone who reads our repo gets access to the database. With many online services, that access can incur costs for which we would be responsible.</p>
<p>But don’t worry, we can guard against that easily. Here’s how:</p>
<p>First, edit your <code class="docutils literal notranslate"><span class="pre">.gitignore</span></code> file and add the following 2 lines at the end of the file, exactly as shown:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="c1"># secrets</span>
<span class="o">.</span><span class="n">config</span><span class="o">/</span>
</pre></div>
</div>
<p>This will ensure that you don’t accidentally add your password to git.</p>
<p>NOTE: this still stores your password in plain text on your computer, so it is not secure enough for critical use!</p>
<p>Now, in the parent directory of your local project, make a new directory called <code class="docutils literal notranslate"><span class="pre">.config</span></code>. Note the leading period.</p>
<p>In the newly created <code class="docutils literal notranslate"><span class="pre">.config</span></code> directory, create a new file called <code class="docutils literal notranslate"><span class="pre">config</span></code>. Note no leading period.</p>
<p>Edit the <code class="docutils literal notranslate"><span class="pre">config</span></code> file using your preferred editor, creating lines as follows:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span><span class="n">configuration</span><span class="p">]</span>
<span class="n">neo4juser</span> <span class="o">=</span>
<span class="n">neo4jpw</span> <span class="o">=</span>
</pre></div>
</div>
<p>At the end of each line, enter a space after the =, and then the user name and password created in step 4 above. Your config file will look something like this:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span><span class="n">configuration</span><span class="p">]</span>
<span class="n">neo4juser</span> <span class="o">=</span> <span class="n">example1</span>
<span class="n">neo4jpw</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">wJRVveeeg9LL</span><span class="o">.</span><span class="n">CyWKF4RbGf2SWTKp</span>
</pre></div>
</div>
<p>Save that config file.</p>
<p>Your user name and password are now safely stored where Python can access them. The <code class="docutils literal notranslate"><span class="pre">.gitignore</span></code> change will prevent the <code class="docutils literal notranslate"><span class="pre">.config</span></code> files from being accidentally pushed to GitHub.</p>
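<p>Reading those values back from Python could look something like the sketch below, which uses the standard library <code class="docutils literal notranslate"><span class="pre">configparser</span></code> module. The function name <code class="docutils literal notranslate"><span class="pre">read_credentials</span></code> is hypothetical, and the path you pass in depends on where you created your <code class="docutils literal notranslate"><span class="pre">.config</span></code> directory:</p>

```python
import configparser


def read_credentials(config_path):
    """Return (user, password) from the [configuration] section."""
    # interpolation=None keeps a "%" in a generated password from being
    # treated as configparser's interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    # parser.read() silently skips missing files, so check its result.
    if not parser.read(config_path):
        raise FileNotFoundError(f"no config file at {config_path}")
    section = parser["configuration"]
    return section["neo4juser"], section["neo4jpw"]
```

<p>A call such as <code class="docutils literal notranslate"><span class="pre">user,</span> <span class="pre">password</span> <span class="pre">=</span> <span class="pre">read_credentials(".config/config")</span></code> would then hand the credentials to the rest of your program without them ever appearing in your source code. Adjust the path to match your own layout.</p>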
<p>So now we need to set up access to Neo4j from Python. To do that we need to install the neo4j driver, which wires up Python to Neo4j.</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>pip install neo4j
</pre></div>
</div>
<p>Now, we are ready to start using our database!</p>
</div>
</div>
<div class="section" id="cypher">
<h3>Cypher<a class="headerlink" href="#cypher" title="Permalink to this headline"></a></h3>
<p>Neo4j uses a query language called Cypher. It plays the same role as SQL for RDBMSs – and the official driver uses it to “talk” to the database.</p>
<p><a class="reference external" href="https://neo4j.com/developer/cypher-query-language/">https://neo4j.com/developer/cypher-query-language/</a></p>
<p>And here is a nice introduction:</p>
<p><a class="reference external" href="https://www.airpair.com/neo4j/posts/getting-started-with-neo4j-and-cypher">https://www.airpair.com/neo4j/posts/getting-started-with-neo4j-and-cypher</a></p>
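<p>To give a feel for how Cypher is sent from Python, here is a minimal sketch. It assumes <code class="docutils literal notranslate"><span class="pre">session</span></code> is a session object from the official driver. The <code class="docutils literal notranslate"><span class="pre">Person</span></code> label, the <code class="docutils literal notranslate"><span class="pre">FRIEND</span></code> relationship, and all of the function names are made up for illustration and are not taken from the class example code:</p>

```python
def add_person(session, name):
    """Create a Person node; $name is filled in from the parameters."""
    session.run("CREATE (p:Person {name: $name})", {"name": name})


def add_friendship(session, name_a, name_b):
    """Link two existing Person nodes with a FRIEND relationship."""
    session.run(
        "MATCH (a:Person {name: $a}), (b:Person {name: $b}) "
        "CREATE (a)-[:FRIEND]->(b)",
        {"a": name_a, "b": name_b},
    )


def friends_of(session, name):
    """Return the names reachable over one outgoing FRIEND relationship."""
    result = session.run(
        "MATCH (:Person {name: $name})-[:FRIEND]->(f:Person) "
        "RETURN f.name AS name",
        {"name": name},
    )
    return [record["name"] for record in result]


def connect(uri, user, password):
    """Open a driver; imported here so the helpers above need no install."""
    from neo4j import GraphDatabase  # third-party: the official driver

    return GraphDatabase.driver(uri, auth=(user, password))
```

<p>With a real database, <code class="docutils literal notranslate"><span class="pre">connect()</span></code> would be given the bolt connection string from your hosting dashboard, and a <code class="docutils literal notranslate"><span class="pre">with</span> <span class="pre">driver.session()</span> <span class="pre">as</span> <span class="pre">session:</span></code> block would supply the session. Passing values as parameters, rather than pasting them into the query string, protects against injection and lets the server reuse the query plan.</p>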
<div class="section" id="quick-test">
<h4>Quick test<a class="headerlink" href="#quick-test" title="Permalink to this headline"></a></h4>
<p>You can find example code in the class repo in:</p>
<p>IntroPython-2017/examples/nosql/neo4j</p>
<p>We are going to play with the code to get a feel for neo4j.</p>
</div>
</div>
</div>
<div class="section" id="other-interfaces-for-neo4j">
<h2>Other interfaces for neo4j<a class="headerlink" href="#other-interfaces-for-neo4j" title="Permalink to this headline"></a></h2>
<p>The official neo4j Python driver is the default Python interface developed by the neo4j team. There are other options:</p>
<div class="section" id="neomodel">
<h3>neomodel<a class="headerlink" href="#neomodel" title="Permalink to this headline"></a></h3>
<p>neomodel is a Django ORM-like object mapper for neo4j.</p>
<p><a class="reference external" href="http://neomodel.readthedocs.io/en/latest/">http://neomodel.readthedocs.io/en/latest/</a></p>
</div>
<div class="section" id="py2neo">
<h3>Py2neo<a class="headerlink" href="#py2neo" title="Permalink to this headline"></a></h3>
<p>Py2neo is a client library and toolkit for working with Neo4j from within Python applications and from the command line. The core library has no external dependencies and has been carefully designed to be easy and intuitive to use.</p>
<p>It “speaks” the Bolt protocol directly.</p>
<p><a class="reference external" href="http://py2neo.org/v3/">http://py2neo.org/v3/</a></p>
</div>
</div>
<div class="section" id="a-bit-more-about-graphs">
<h2>A bit more about Graphs<a class="headerlink" href="#a-bit-more-about-graphs" title="Permalink to this headline"></a></h2>
<p>Graph data structures can be very useful for certain categories of problems.</p>
<p>If you Google something like: “applications of graph data structure in computer science” you will get a lot of pages to explore, like this one:</p>
<p><a class="reference external" href="http://www.cs.cmu.edu/afs/cs/academic/class/15210-s14/www/lectures/graphs.pdf">http://www.cs.cmu.edu/afs/cs/academic/class/15210-s14/www/lectures/graphs.pdf</a></p>
<p>I encourage you to read up about them.</p>
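<p>One classic example of such a problem is finding the shortest chain of connections between two nodes. Here is a minimal sketch of a breadth-first search using only the standard library. The friendship data and the function name are made up for illustration:</p>

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest list of nodes, or None."""
    queue = deque([[start]])  # each queue entry is a path so far
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical friendship graph as an adjacency mapping.
friends = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Dan"],
    "Cat": ["Dan"],
    "Dan": ["Eve"],
}

print(shortest_path(friends, "Ann", "Eve"))  # ['Ann', 'Bob', 'Dan', 'Eve']
```

<p>Because the search explores all paths of length one before any of length two, and so on, the first path that reaches the goal is guaranteed to be one of the shortest.</p>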
<p>If you do find a use case, or simply want to explore the topic experimentally with Python, the main package for working with graphs in Python is <cite>networkx</cite>:</p>
<p><a class="reference external" href="https://networkx.github.io/">https://networkx.github.io/</a></p>
<p>It provides a pretty fully featured set of graph data structures, and the common algorithms for manipulating and exploring them.</p>
<p>There is even a package for storing networkx graphs in neo4j:</p>
<p><a class="reference external" href="https://neonx.readthedocs.io">https://neonx.readthedocs.io</a></p>
<p>So you can store your graph in the neo4j database, and work with it with networkx. This may even give you a nicer, more pythonic interface to neo4j.</p>
</div>
</div>
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="NoSQL.html" class="btn btn-neutral float-left" title="No SQL Databases" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="Concurrency.html" class="btn btn-neutral float-right" title="Concurrent Programming" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>
<div role="contentinfo">
<p>© Copyright 2020, University of Washington, Natasha Aleksandrova, Christopher Barker, Brian Dorsey, Cris Ewing, Christy Heaton, Jon Jacky, Maria McKinley, Andy Miles, Rick Riehle, Joseph Schilz, Joseph Sheedy, Hosung Song. Creative Commons Attribution-ShareAlike 4.0 license.</p>
</div>
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script>
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>
</body>
</html>