<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>File Reading and Writing — Programming in Python 7.0 documentation</title>
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<!--[if lt IE 9]>
<script src="../_static/js/html5shiv.min.js"></script>
<![endif]-->
<script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js"></script>
<script src="../_static/jquery.js"></script>
<script src="../_static/underscore.js"></script>
<script src="../_static/doctools.js"></script>
<script src="../_static/js/theme.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="next" title="File Exercise" href="../exercises/file_processing/file_lab.html" />
<link rel="prev" title="9. File Handling" href="../topics/09-files/index.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" style="background: #4b2e83" >
<a href="../index.html">
<img src="../_static/UWPCE_logo_full.png" class="logo" alt="Logo"/>
</a>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<p class="caption" role="heading"><span class="caption-text">Topics in the Program</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../topics/01-setting_up/index.html">1. Setting up your Environment</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/02-basic_python/index.html">2. Basic Python</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/03-recursion_booleans/index.html">3. Booleans and Recursion</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/04-sequences_iteration/index.html">4. Sequences and Iteration</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/05-text_handling/index.html">5. Basic Text Handling</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/06-exceptions/index.html">6. Exception Handling</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/07-unit_testing/index.html">7. Unit Testing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/08-dicts_sets/index.html">8. Dictionaries and Sets</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="../topics/09-files/index.html">9. File Handling</a><ul class="current">
<li class="toctree-l2 current"><a class="current reference internal" href="#">File Reading and Writing</a></li>
<li class="toctree-l2"><a class="reference internal" href="../exercises/file_processing/file_lab.html">File Exercise</a></li>
<li class="toctree-l2"><a class="reference internal" href="../exercises/file_processing/file_processing.html">File Processing</a></li>
<li class="toctree-l2"><a class="reference internal" href="../exercises/mailroom/mailroom_with_files.html">Mailroom With Files</a></li>
<li class="toctree-l2"><a class="reference internal" href="../exercises/trigrams/trigrams.html">Trigrams – Simple Text Manipulation</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../topics/10-modules_packages/index.html">10. Modules and Packages</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/11-argument_passing/index.html">11. Advanced Argument Passing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/12-comprehensions/index.html">12. Comprehensions</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/13-intro_oo/index.html">13. Intro to Object Oriented Programing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/14-magic_methods/index.html">14. Properties and Magic Methods</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/15-subclassing/index.html">15. Subclassing and Inheritance</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/16-multiple_inheritance/index.html">16. Multiple Inheritance</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/17-functional_programming/index.html">17. Introduction to Functional Programming</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/18-advanced_testing/index.html">18. Advanced Testing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../topics/99-extras/index.html">19. Extra Topics</a></li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" style="background: #4b2e83" >
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../index.html">Programming in Python</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content style-external-links">
<div role="navigation" aria-label="Page navigation">
<ul class="wy-breadcrumbs">
<li><a href="../index.html" class="icon icon-home"></a> »</li>
<li><a href="../topics/09-files/index.html"><span class="section-number">9. </span>File Handling</a> »</li>
<li>File Reading and Writing</li>
<li class="wy-breadcrumbs-aside">
<a href="../_sources/modules/Files.rst.txt" rel="nofollow"> View page source</a>
</li>
</ul><div class="rst-breadcrumbs-buttons" role="navigation" aria-label="Sequential page navigation">
<a href="../topics/09-files/index.html" class="btn btn-neutral float-left" title="9. File Handling" accesskey="p"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="../exercises/file_processing/file_lab.html" class="btn btn-neutral float-right" title="File Exercise" accesskey="n">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="file-reading-and-writing">
<span id="files"></span><h1>File Reading and Writing<a class="headerlink" href="#file-reading-and-writing" title="Permalink to this headline"></a></h1>
<p>Saving and loading data.</p>
<div class="section" id="id1">
<h2>Files<a class="headerlink" href="#id1" title="Permalink to this headline"></a></h2>
<div class="section" id="text-files">
<h3>Text Files<a class="headerlink" href="#text-files" title="Permalink to this headline"></a></h3>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'secrets.txt'</span><span class="p">)</span>
<span class="n">secret_data</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
<p><code class="docutils literal notranslate"><span class="pre">secret_data</span></code> is a string</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>In Python 3, files are opened in text mode by default, and the default encoding is platform dependent: it is UTF-8 on most Linux and macOS systems, but may be something else (often a legacy code page) on Windows. In the usual case you get a proper Unicode string to work with, as UTF-8 is the most common encoding for text. UTF-8 is also ASCII compatible, so ASCII files will “just work”. If “Unicode” and “ASCII” mean nothing to you, don’t worry about it: just know that things will usually work for text, even non-English text. And if you get odd characters or a <code class="docutils literal notranslate"><span class="pre">UnicodeDecodeError</span></code>, then your file is probably not in the encoding Python assumed, and it’s time to Google “Python Unicode”. (More info here: <a class="reference internal" href="Unicode.html#unicode"><span class="std std-ref">Unicode in Python</span></a>.)</p>
</div>
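<p>A minimal sketch of being explicit about the encoding instead of relying on the default. The file name and the sample text are made up for illustration; <code class="docutils literal notranslate"><span class="pre">encoding</span></code> is a standard keyword argument of <code class="docutils literal notranslate"><span class="pre">open()</span></code>:</p>

```python
import os
import tempfile

# Hypothetical file in a scratch directory, for illustration only.
path = os.path.join(tempfile.mkdtemp(), 'secrets.txt')

# Write some non-ASCII text, stating the encoding explicitly.
with open(path, 'w', encoding='utf-8') as f:
    f.write('caf\u00e9\n')

# Read it back with the same encoding: the result is a str.
with open(path, encoding='utf-8') as f:
    secret_data = f.read()

# Decoding the same bytes with the wrong codec fails loudly.
try:
    with open(path, encoding='ascii') as f:
        f.read()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(type(secret_data).__name__, decode_failed)
```

<p>Passing <code class="docutils literal notranslate"><span class="pre">encoding</span></code> every time is a bit more typing, but the code then behaves the same way on every platform.</p>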
</div>
<div class="section" id="binary-files">
<h3>Binary Files<a class="headerlink" href="#binary-files" title="Permalink to this headline"></a></h3>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'secrets.bin'</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span>
<span class="n">secret_data</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
<p><code class="docutils literal notranslate"><span class="pre">secret_data</span></code> is a <code class="docutils literal notranslate"><span class="pre">bytes</span></code> object (with arbitrary bytes in it – well, not arbitrary – whatever is in the file!)</p>
<p>(See the <code class="docutils literal notranslate"><span class="pre">struct</span></code> module for unpacking binary data.)</p>
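<p>As a rough sketch of what unpacking with <code class="docutils literal notranslate"><span class="pre">struct</span></code> can look like: the record layout below (one unsigned 4-byte integer followed by one signed 2-byte integer, in network byte order) is invented for the example, as are the file name and the values.</p>

```python
import os
import struct
import tempfile

# Hypothetical record layout: an unsigned 4-byte int followed by a
# signed 2-byte int, in network (big-endian) byte order.
record_format = '!Ih'

path = os.path.join(tempfile.mkdtemp(), 'secrets.bin')
with open(path, 'wb') as f:
    f.write(struct.pack(record_format, 1024, -2))

with open(path, 'rb') as f:
    secret_data = f.read()  # a bytes object

# Turn the raw bytes back into Python integers.
values = struct.unpack(record_format, secret_data)
print(secret_data, values)
```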
</div>
<div class="section" id="file-opening-modes">
<h3>File Opening Modes<a class="headerlink" href="#file-opening-modes" title="Permalink to this headline"></a></h3>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'secrets.txt'</span><span class="p">,</span> <span class="p">[</span><span class="n">mode</span><span class="p">])</span>
<span class="s1">'r'</span><span class="p">,</span> <span class="s1">'w'</span><span class="p">,</span> <span class="s1">'a'</span>
<span class="s1">'rb'</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">,</span> <span class="s1">'ab'</span>
<span class="s1">'r+'</span><span class="p">,</span> <span class="s1">'w+'</span><span class="p">,</span> <span class="s1">'a+'</span>
<span class="s1">'r+b'</span><span class="p">,</span> <span class="s1">'w+b'</span><span class="p">,</span> <span class="s1">'a+b'</span>
</pre></div>
</div>
<p>These follow the Unix conventions, and aren’t all that well documented
in the Python docs. But these BSD docs make it pretty clear:</p>
<p><a class="reference external" href="http://www.manpagez.com/man/3/fopen/">http://www.manpagez.com/man/3/fopen/</a></p>
<p><strong>Gotcha</strong> – ‘w’ modes always clear the file if it already exists!</p>
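<p>A small sketch of the difference between the <code class="docutils literal notranslate"><span class="pre">'w'</span></code> and <code class="docutils literal notranslate"><span class="pre">'a'</span></code> modes. It works in a scratch directory, and the file name and contents are made up for the example:</p>

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'secrets.txt')

with open(path, 'w') as f:   # creates the file
    f.write('first\n')

with open(path, 'a') as f:   # appends to the end
    f.write('second\n')

with open(path) as f:        # 'r' is the default mode
    after_append = f.read()

with open(path, 'w') as f:   # truncates: the old content is gone
    f.write('third\n')

with open(path) as f:
    after_rewrite = f.read()

print(repr(after_append))
print(repr(after_rewrite))
```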
</div>
<div class="section" id="text-file-notes">
<h3>Text File Notes<a class="headerlink" href="#text-file-notes" title="Permalink to this headline"></a></h3>
<p>Text is default:</p>
<blockquote>
<div><ul class="simple">
<li><p>Newlines are translated when both reading and writing: <code class="docutils literal notranslate"><span class="pre">\r\n</span></code> in the file becomes <code class="docutils literal notranslate"><span class="pre">\n</span></code> in your string on reading, and <code class="docutils literal notranslate"><span class="pre">\n</span></code> is written out as the platform’s native line ending.</p></li>
<li><p>Use *nix-style in your code: <code class="docutils literal notranslate"><span class="pre">\n</span></code></p></li>
</ul>
</div></blockquote>
<p>Gotcha:</p>
<blockquote>
<div><ul class="simple">
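<li><p>in Python 3, text and binary modes differ on every platform: text mode gives you <code class="docutils literal notranslate"><span class="pre">str</span></code>, binary mode gives you <code class="docutils literal notranslate"><span class="pre">bytes</span></code></p></li>
<li><p>opening binary data in text mode will corrupt it or cause an error; this bites hardest on Windows, where newline translation changes the bytes that are written.</p></li>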
</ul>
</div></blockquote>
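<p>To see the newline translation in action, here is a small sketch that writes Windows-style line endings as raw bytes and then reads them back both ways. The file name is made up, and the file lives in a scratch directory:</p>

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'dos_file.txt')

# Write Windows-style line endings as raw bytes, so nothing is translated.
with open(path, 'wb') as f:
    f.write(b'line one\r\nline two\r\n')

# Text mode (the default) translates them to '\n' on reading...
with open(path) as f:
    as_text = f.read()

# ...while binary mode hands back exactly what is in the file.
with open(path, 'rb') as f:
    as_bytes = f.read()

print(repr(as_text))
print(repr(as_bytes))
```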
</div>
<div class="section" id="file-reading">
<h3>File Reading<a class="headerlink" href="#file-reading" title="Permalink to this headline"></a></h3>
<p>Reading part of a file:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">header_size</span> <span class="o">=</span> <span class="mi">4096</span>
<span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'secrets.txt'</span><span class="p">)</span>
<span class="n">secret_header</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="n">header_size</span><span class="p">)</span>
<span class="n">secret_rest</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
</div>
<div class="section" id="common-idioms">
<h3>Common Idioms<a class="headerlink" href="#common-idioms" title="Permalink to this headline"></a></h3>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'secrets.txt'</span><span class="p">):</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">line</span><span class="p">)</span>
</pre></div>
</div>
<p>(The file object is an iterable that iterates through the lines in a text file.)</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'secrets.txt'</span><span class="p">)</span>
<span class="k">while</span> <span class="kc">True</span><span class="p">:</span>
    <span class="n">line</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">line</span><span class="p">:</span>
        <span class="k">break</span>
    <span class="n">do_something_with_line</span><span class="p">(</span><span class="n">line</span><span class="p">)</span>
<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
<p>We will learn more about the keyword <code class="docutils literal notranslate"><span class="pre">with</span></code> later (it creates a “context manager”), but for now, just understand the syntax and the advantage over simply opening the file:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'workfile'</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">read_data</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">f</span><span class="o">.</span><span class="n">closed</span>  <span class="c1"># True: the file was closed when the block ended</span>
</pre></div>
</div>
<p>You use <code class="docutils literal notranslate"><span class="pre">with</span></code> to open the file, and assign it a name (<code class="docutils literal notranslate"><span class="pre">f</span></code> in this case).
The file remains open while in the <code class="docutils literal notranslate"><span class="pre">with</span></code> block.
At the end of the <code class="docutils literal notranslate"><span class="pre">with</span></code> block, the file is unconditionally closed, even if an Exception is raised. Your code will (mostly) work without it, but always using <code class="docutils literal notranslate"><span class="pre">with</span></code> to open a file is a good habit to get into.</p>
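<p>Here is a quick sketch of the “even if an Exception is raised” part. The <code class="docutils literal notranslate"><span class="pre">ValueError</span></code> is raised on purpose to simulate something going wrong in the middle of the block, and the file is a throwaway one in a scratch directory:</p>

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')
with open(path, 'w') as f:
    f.write('some data\n')

try:
    with open(path) as f:
        read_data = f.read()
        raise ValueError('something went wrong')  # simulated failure
except ValueError:
    pass

# The file was closed on the way out of the block, despite the error.
print(f.closed)
```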
</div>
<div class="section" id="file-writing">
<h3>File Writing<a class="headerlink" href="#file-writing" title="Permalink to this headline"></a></h3>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">outfile</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'output.txt'</span><span class="p">,</span> <span class="s1">'w'</span><span class="p">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span>
    <span class="n">outfile</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s2">"this is line: </span><span class="si">%i</span><span class="se">\n</span><span class="s2">"</span><span class="o">%</span><span class="n">i</span><span class="p">)</span>
<span class="n">outfile</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>

<span class="c1"># the same thing using "with", which closes the file for you:</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'output.txt'</span><span class="p">,</span> <span class="s1">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span>
        <span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s2">"this is line: </span><span class="si">%i</span><span class="se">\n</span><span class="s2">"</span><span class="o">%</span><span class="n">i</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="section" id="file-methods">
<h3>File Methods<a class="headerlink" href="#file-methods" title="Permalink to this headline"></a></h3>
<p>Commonly Used Methods:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span> <span class="n">f</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span> <span class="n">f</span><span class="o">.</span><span class="n">readlines</span><span class="p">()</span>
<span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="nb">str</span><span class="p">)</span> <span class="n">f</span><span class="o">.</span><span class="n">writelines</span><span class="p">(</span><span class="n">seq</span><span class="p">)</span>
<span class="n">f</span><span class="o">.</span><span class="n">seek</span><span class="p">(</span><span class="n">offset</span><span class="p">)</span> <span class="n">f</span><span class="o">.</span><span class="n">tell</span><span class="p">()</span> <span class="c1"># for binary files, mostly</span>
<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
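<p>A short sketch that exercises most of these methods on a throwaway file. The file name and contents are invented, and <code class="docutils literal notranslate"><span class="pre">newline='\n'</span></code> is passed when writing only so that the byte offsets come out the same on every platform:</p>

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'output.txt')

# newline='\n' turns off newline translation on writing.
with open(path, 'w', newline='\n') as f:
    # writelines() does not add newlines for you
    f.writelines(['one\n', 'two\n', 'three\n'])

with open(path) as f:
    first = f.readline()     # just the first line
    rest = f.readlines()     # the remaining lines, as a list
    f.seek(0)                # back to the start of the file
    everything = f.read()

with open(path, 'rb') as f:
    f.seek(4)                # skip the first four bytes
    position = f.tell()      # where we are now
    tail = f.read()          # everything after that point

print(repr(first), rest, position, tail)
```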
</div>
<div class="section" id="stringio">
<h3><code class="docutils literal notranslate"><span class="pre">StringIO</span></code><a class="headerlink" href="#stringio" title="Permalink to this headline"></a></h3>
<p>A <code class="docutils literal notranslate"><span class="pre">StringIO</span></code> object is a “file-like” object that stores its content in memory.
That is, it has all the methods of a file, and behaves the same way, but never writes anything to disk.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">6</span><span class="p">]:</span> <span class="kn">import</span> <span class="nn">io</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">7</span><span class="p">]:</span> <span class="n">f</span> <span class="o">=</span> <span class="n">io</span><span class="o">.</span><span class="n">StringIO</span><span class="p">()</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s2">"some stuff"</span><span class="p">)</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="mi">10</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="n">f</span><span class="o">.</span><span class="n">seek</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="mi">0</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">10</span><span class="p">]:</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">10</span><span class="p">]:</span> <span class="s1">'some stuff'</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">11</span><span class="p">]:</span> <span class="n">f</span><span class="o">.</span><span class="n">getvalue</span><span class="p">()</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">11</span><span class="p">]:</span> <span class="s1">'some stuff'</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">12</span><span class="p">]:</span> <span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
<p>(This can be handy for testing file handling code…)</p>
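<p>For instance, a test might look something like this sketch. <code class="docutils literal notranslate"><span class="pre">count_lines</span></code> is a hypothetical helper written just for this example; the point is that it takes an open file object, so a test can hand it a <code class="docutils literal notranslate"><span class="pre">StringIO</span></code> instead of a real file:</p>

```python
import io


def count_lines(infile):
    """Count the non-blank lines in an open text file (hypothetical helper)."""
    return sum(1 for line in infile if line.strip())


# In a test, hand the function a StringIO instead of a real file.
fake_file = io.StringIO('first line\n\nsecond line\n')
result = count_lines(fake_file)
print(result)
```

<p>Writing functions that accept an open file object, rather than a file name, is what makes this trick possible.</p>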
</div>
</div>
<div class="section" id="paths-and-directories">
<h2>Paths and Directories<a class="headerlink" href="#paths-and-directories" title="Permalink to this headline"></a></h2>
<div class="section" id="paths">
<h3>Paths<a class="headerlink" href="#paths" title="Permalink to this headline"></a></h3>
<p>Paths are generally handled with simple strings.</p>
<p>Relative paths:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="s1">'secret.txt'</span>
<span class="s1">'./secret.txt'</span>
</pre></div>
</div>
<p>Absolute paths:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="s1">'/home/chris/secret.txt'</span>
</pre></div>
</div>
<p>Either works with <code class="docutils literal notranslate"><span class="pre">open()</span></code> , etc.</p>
<p>Relative paths are relative to the current working directory, that is, the directory the program was started from. You can usually only count on knowing what that is for command-line programs.</p>
</div>
<div class="section" id="os-module">
<h3><code class="docutils literal notranslate"><span class="pre">os</span></code> module<a class="headerlink" href="#os-module" title="Permalink to this headline"></a></h3>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">os</span><span class="o">.</span><span class="n">getcwd</span><span class="p">()</span>
<span class="n">os</span><span class="o">.</span><span class="n">chdir</span><span class="p">(</span><span class="n">path</span><span class="p">)</span>
</pre></div>
</div>
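<p>A small sketch of how the working directory affects relative paths. It changes into a scratch directory, creates a file by relative name, and then changes back; the file name is made up for the example:</p>

```python
import os
import tempfile

original_dir = os.getcwd()            # where we started
scratch_dir = os.path.realpath(tempfile.mkdtemp())

os.chdir(scratch_dir)                 # change the working directory
try:
    # A relative path now refers to a file inside scratch_dir.
    with open('secret.txt', 'w') as f:
        f.write('hello\n')
    created_here = os.path.exists(os.path.join(scratch_dir, 'secret.txt'))
    now_in = os.path.realpath(os.getcwd())
finally:
    os.chdir(original_dir)            # always put things back

print(created_here, now_in == scratch_dir)
```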
</div>
<div class="section" id="os-path-module">
<h3><code class="docutils literal notranslate"><span class="pre">os.path</span></code> module<a class="headerlink" href="#os-path-module" title="Permalink to this headline"></a></h3>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">split</span><span class="p">()</span>
<span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">splitext</span><span class="p">()</span>
<span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">basename</span><span class="p">()</span>
<span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">()</span>
<span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">()</span>
<span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">abspath</span><span class="p">()</span>
<span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">relpath</span><span class="p">()</span>
</pre></div>
</div>
<p>(all platform independent)</p>
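<p>A quick sketch of what a few of these do, using a made-up relative path. The separators in the printed results depend on your platform, which is exactly why it pays to let <code class="docutils literal notranslate"><span class="pre">os.path</span></code> build and split paths for you:</p>

```python
import os

# Build a path with the right separator for this platform.
path = os.path.join('home', 'chris', 'secret.txt')

directory, filename = os.path.split(path)      # (head, tail)
root, extension = os.path.splitext(filename)   # name and extension

print(os.path.basename(path))                  # same as filename
print(os.path.dirname(path) == directory)
print(root, extension)

absolute = os.path.abspath(path)               # anchored at the current dir
relative = os.path.relpath(path, 'home')       # path as seen from 'home'
print(os.path.isabs(absolute), relative)
```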
</div>
<div class="section" id="directories">
<h3>Directories<a class="headerlink" href="#directories" title="Permalink to this headline"></a></h3>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">os</span><span class="o">.</span><span class="n">listdir</span><span class="p">()</span>
<span class="n">os</span><span class="o">.</span><span class="n">mkdir</span><span class="p">()</span>
<span class="n">os</span><span class="o">.</span><span class="n">walk</span><span class="p">()</span>
</pre></div>
</div>
<p>(Note the <code class="docutils literal notranslate"><span class="pre">shutil</span></code> module provides higher level operations.)</p>
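<p>A sketch of these three functions working on a scratch directory, with <code class="docutils literal notranslate"><span class="pre">shutil.rmtree()</span></code> cleaning up at the end as an example of a higher-level operation. The directory and file names are made up:</p>

```python
import os
import shutil
import tempfile

top = tempfile.mkdtemp()                       # scratch directory
os.mkdir(os.path.join(top, 'a_dir'))           # create one new directory
with open(os.path.join(top, 'a_dir', 'a_file.txt'), 'w') as f:
    f.write('hi\n')

names = os.listdir(top)                        # names directly inside top
print(names)

# os.walk() visits top and every directory below it.
found = []
for dirpath, dirnames, filenames in os.walk(top):
    for name in filenames:
        found.append(os.path.relpath(os.path.join(dirpath, name), top))
print(found)

shutil.rmtree(top)                             # remove the whole tree
```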
</div>
<div class="section" id="pathlib">
<h3>pathlib<a class="headerlink" href="#pathlib" title="Permalink to this headline"></a></h3>
<p><code class="docutils literal notranslate"><span class="pre">pathlib</span></code> is a standard library module (since Python 3.4) for handling paths in an OO way:</p>
<p><a class="reference external" href="http://pathlib.readthedocs.org/en/pep428/">http://pathlib.readthedocs.org/en/pep428/</a></p>
<p>All the stuff in os.path and more:</p>
<div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [14]: </span><span class="kn">import</span> <span class="nn">pathlib</span>
<span class="gp">In [15]: </span><span class="n">pth</span> <span class="o">=</span> <span class="n">pathlib</span><span class="o">.</span><span class="n">Path</span><span class="p">(</span><span class="s1">'./'</span><span class="p">)</span>
<span class="gp">In [16]: </span><span class="n">pth</span><span class="o">.</span><span class="n">is_dir</span><span class="p">()</span>
<span class="gh">Out[16]: </span><span class="go">True</span>
<span class="gp">In [17]: </span><span class="n">pth</span><span class="o">.</span><span class="n">absolute</span><span class="p">()</span>
<span class="gh">Out[17]: </span><span class="go">PosixPath('/Users/Chris/PythonStuff/UWPCE/Fall2018-PY210A/examples/Session02')</span>
<span class="gp">In [18]: </span><span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">pth</span><span class="o">.</span><span class="n">iterdir</span><span class="p">():</span>
<span class="go"> ...: print(f)</span>
<span class="go"> ...:</span>
<span class="go"> ...:</span>
</pre></div>
</div>
<p>And it has a really nifty way to join paths, by overloading the “division” operator:</p>
<div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [49]: </span><span class="n">p</span> <span class="o">=</span> <span class="n">pathlib</span><span class="o">.</span><span class="n">Path</span><span class="o">.</span><span class="n">home</span><span class="p">()</span> <span class="c1"># create a path to the user home dir.</span>
<span class="gp">In [50]: </span><span class="n">p</span>
<span class="gh">Out[50]: </span><span class="go">PosixPath('/Users/Chris')</span>
<span class="gp">In [51]: </span><span class="n">p</span> <span class="o">/</span> <span class="s2">"a_dir"</span> <span class="o">/</span> <span class="s2">"one_more"</span> <span class="o">/</span> <span class="s2">"a_filename"</span>
<span class="gh">Out[51]: </span><span class="go">PosixPath('/Users/Chris/a_dir/one_more/a_filename')</span>
</pre></div>
</div>
<p>Kinda slick, eh?</p>
<p>For the full docs:</p>
<p><a class="reference external" href="https://docs.python.org/3/library/pathlib.html">https://docs.python.org/3/library/pathlib.html</a></p>
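<p>To give a feel for the “and more” part, here is a sketch of a few <code class="docutils literal notranslate"><span class="pre">Path</span></code> attributes and methods next to their rough <code class="docutils literal notranslate"><span class="pre">os</span></code> and <code class="docutils literal notranslate"><span class="pre">os.path</span></code> counterparts. It works in a scratch directory, and the names are made up:</p>

```python
import pathlib
import tempfile

top = pathlib.Path(tempfile.mkdtemp())         # scratch directory
target = top / 'a_dir' / 'a_file.txt'

target.parent.mkdir()                          # like os.mkdir()
target.write_text('hi\n')                      # open, write, close in one call

print(target.name)                             # like os.path.basename()
print(target.stem, target.suffix)              # like os.path.splitext()
print(target.exists(), repr(target.read_text()))

listing = [p.name for p in top.iterdir()]      # like os.listdir()
print(listing)
```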
</div>
<div class="section" id="the-path-protocol">
<h3>The Path Protocol<a class="headerlink" href="#the-path-protocol" title="Permalink to this headline"></a></h3>
<p>As of Python 3.6, there is now a protocol for making arbitrary objects act like paths:</p>
<p>Read about it in PEP 519:</p>
<p><a class="reference external" href="https://www.python.org/dev/peps/pep-0519/">https://www.python.org/dev/peps/pep-0519/</a></p>
<p>This was added because most built-in file handling modules, as well as any number of third party packages that needed a path, worked only with string paths.</p>
<p>Even after <code class="docutils literal notranslate"><span class="pre">pathlib</span></code> was added to the standard library, you couldn’t pass a <code class="docutils literal notranslate"><span class="pre">Path</span></code> object in where a path was needed, not even to the most common functions like <code class="docutils literal notranslate"><span class="pre">open()</span></code>.</p>
<p>So you could use the nifty path manipulation stuff, but still needed to call <code class="docutils literal notranslate"><span class="pre">str</span></code> on it:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">p</span> <span class="o">=</span> <span class="n">pathlib</span><span class="o">.</span><span class="n">Path</span><span class="o">.</span><span class="n">home</span><span class="p">()</span> <span class="o">/</span> <span class="s1">'a_filename.txt'</span>
<span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">p</span><span class="p">),</span> <span class="s1">'r'</span><span class="p">)</span>
</pre></div>
</div>
<p>Rather than add explicit support for <code class="docutils literal notranslate"><span class="pre">Path</span></code> objects, a new protocol was defined, and most of the standard library was updated to support the new protocol.</p>
<p>This way, third party path libraries could be used with the standard library as well.</p>
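<p>As a minimal sketch of the protocol: an object counts as path-like if it defines <code class="docutils literal notranslate"><span class="pre">__fspath__()</span></code> returning a string (or bytes). The <code class="docutils literal notranslate"><span class="pre">ProjectFile</span></code> class and the file name below are hypothetical, made up only for illustration.</p>

```python
import os
import tempfile


class ProjectFile:
    """Hypothetical path-like class: not part of pathlib or any real library."""

    def __init__(self, directory, name):
        self.directory = directory
        self.name = name

    def __fspath__(self):
        # The protocol: return the path as a str (or bytes).
        return os.path.join(self.directory, self.name)


with tempfile.TemporaryDirectory() as tmp:
    target = ProjectFile(tmp, "notes.txt")

    # open() accepts the object directly because it implements __fspath__
    with open(target, "w") as outfile:
        outfile.write("hello")

    # os.fspath() is the standard way to get the underlying str or bytes
    path_string = os.fspath(target)
    file_exists = os.path.exists(target)
    with open(target) as infile:
        contents = infile.read()
```

<p>Code that receives a path from a caller can call <code class="docutils literal notranslate"><span class="pre">os.fspath()</span></code> on it once and then work with a plain string, whatever kind of path object was passed in.</p>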
</div>
<div class="section" id="what-this-means-to-you">
<h3>What this means to you<a class="headerlink" href="#what-this-means-to-you" title="Permalink to this headline"></a></h3>
<p>Unless you are writing a path manipulation library, or a library that deals with paths other than with the stdlib packages (like <code class="docutils literal notranslate"><span class="pre">open()</span></code>), all you need to know is that you can use <code class="docutils literal notranslate"><span class="pre">Path</span></code> objects most places you need a path.</p>
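<p>In everyday code that might look like the following sketch, which runs on Python 3.6 or later. The file names are placeholders, and a temporary directory is used so the example cleans up after itself.</p>

```python
import os
import pathlib
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = pathlib.Path(tmp) / "source.txt"
    copy = pathlib.Path(tmp) / "copy.txt"

    # No str() call needed: open() accepts the Path directly
    with open(source, "w") as outfile:
        outfile.write("some text")

    # Other stdlib functions accept Path objects too
    shutil.copy(source, copy)
    size = os.path.getsize(copy)
    copied_text = copy.read_text()
```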
<p>I expect we will see expanded use of pathlib as Python 3.6 and 3.7 become widely used.</p>
</div>
</div>
<div class="section" id="some-added-notes">
<h2>Some added notes:<a class="headerlink" href="#some-added-notes" title="Permalink to this headline"></a></h2>
<div class="section" id="using-files-and-with">
<h3>Using files and “with”<a class="headerlink" href="#using-files-and-with" title="Permalink to this headline"></a></h3>
<p>Sorry for the confusion, but I’ll be more clear now.</p>
<p>When working with files, unless you have a good reason not to, use <code class="docutils literal notranslate"><span class="pre">with</span></code>:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">the_filename</span><span class="p">,</span> <span class="s1">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">outfile</span><span class="p">:</span>
    <span class="n">outfile</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">something</span><span class="p">)</span>
    <span class="c1"># do some more with outfile ...</span>
<span class="c1"># now done with outfile -- it will be closed, regardless of errors, etc.</span>
<span class="n">do_other_stuff</span>
</pre></div>
</div>
<p><code class="docutils literal notranslate"><span class="pre">with</span></code> invokes a context manager – which can be confusing, but for now, just follow this pattern – it really is more robust.</p>
<p>And you can even do two at once:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">source</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">infile</span><span class="p">,</span> <span class="nb">open</span><span class="p">(</span><span class="n">dest</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">outfile</span><span class="p">:</span>
    <span class="n">outfile</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">infile</span><span class="o">.</span><span class="n">read</span><span class="p">())</span>
</pre></div>
</div>
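<p>To see what <code class="docutils literal notranslate"><span class="pre">with</span></code> buys you, here is a small sketch that raises an error in the middle of writing and then checks that the file was closed anyway. The file name and the error are invented for the demonstration. The second half shows roughly what you would have to write by hand without <code class="docutils literal notranslate"><span class="pre">with</span></code>.</p>

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "example.txt")

    try:
        with open(filename, "w") as outfile:
            outfile.write("partial output")
            # simulate something going wrong part way through
            raise ValueError("something went wrong")
    except ValueError:
        pass

    # the with block closed the file even though an exception was raised
    closed_after_error = outfile.closed

    # without "with", the equivalent takes an explicit try / finally
    outfile = open(filename, "w")
    try:
        outfile.write("complete output")
    finally:
        outfile.close()
    closed_after_finally = outfile.closed
```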
</div>
<div class="section" id="id2">
<h3>Binary files<a class="headerlink" href="#id2" title="Permalink to this headline"></a></h3>
<p>Python can open files in one of two modes:</p>
<blockquote>
<div><ul class="simple">
<li><p>Text</p></li>
<li><p>Binary</p></li>
</ul>
</div></blockquote>
<p>This is just what you’d think – if the file contains text, you want text mode. If the file contains arbitrary binary data, you want binary mode.</p>
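<p>All data in all files is binary – that’s how computers work. So in Python 3, “text” actually means Unicode – a standard that assigns a number to every character.</p>
<p>But this too is complicated – there are multiple ways that Unicode text can be stored as binary data, known as “encodings”. In Python, text files are by default opened with an encoding chosen by the platform: “utf-8” on most current macOS and Linux systems, but often something else on Windows. These days, utf-8 mostly “just works”, and you can ask for it explicitly by passing <code class="docutils literal notranslate"><span class="pre">encoding="utf-8"</span></code> to <code class="docutils literal notranslate"><span class="pre">open()</span></code>.</p>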
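<p>A short sketch of what an encoding does, and of naming one explicitly when opening a file. The sample word and file name are arbitrary choices for the example.</p>

```python
import os
import tempfile

text = "café"

# the same text becomes different bytes under different encodings
utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "cafe.txt")

    # naming the encoding makes the result independent of platform defaults
    with open(filename, "w", encoding="utf-8") as outfile:
        outfile.write(text)

    with open(filename, encoding="utf-8") as infile:
        round_trip = infile.read()

    # reading with the wrong encoding does not fail here, it gives wrong text
    with open(filename, encoding="latin-1") as infile:
        mangled = infile.read()
```

<p>Note that the wrong encoding does not always raise an error. Here it silently produces garbled text, which is one reason to be explicit about the encoding.</p>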
<p>But if you read a binary file as text, Python will try to decode the bytes as text (as utf-8 in this example), and this will likely fail:</p>
<div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [13]: </span><span class="nb">open</span><span class="p">(</span><span class="s2">"a_photo.jpg"</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="gt">---------------------------------------------------------------------------</span>
<span class="ne">UnicodeDecodeError</span><span class="g g-Whitespace"> </span>Traceback (most recent call last)
<span class="nn">&lt;ipython-input-13-5c699bc20e80&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
<span class="ne">----> </span><span class="mi">1</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"a_photo.jpg"</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="nn">/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/codecs.py</span> in <span class="ni">decode</span><span class="nt">(self, input, final)</span>
<span class="g g-Whitespace"> </span><span class="mi">319</span> <span class="c1"># decode input (taking the buffer into account)</span>
<span class="g g-Whitespace"> </span><span class="mi">320</span> <span class="n">data</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">buffer</span> <span class="o">+</span> <span class="nb">input</span>
<span class="ne">--> </span><span class="mi">321</span> <span class="p">(</span><span class="n">result</span><span class="p">,</span> <span class="n">consumed</span><span class="p">)</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_buffer_decode</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">errors</span><span class="p">,</span> <span class="n">final</span><span class="p">)</span>
<span class="g g-Whitespace"> </span><span class="mi">322</span> <span class="c1"># keep undecoded input until the next call</span>
<span class="g g-Whitespace"> </span><span class="mi">323</span> <span class="bp">self</span><span class="o">.</span><span class="n">buffer</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">consumed</span><span class="p">:]</span>
<span class="ne">UnicodeDecodeError</span>: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
</pre></div>
</div>
<p>In Python 2, you are less likely to get an error like this: it doesn’t try to decode the file as it’s read, even for text files. Mistakes can go unnoticed until later, which makes it a bit tricky and more error prone.</p>
<p><strong>NOTE:</strong> If you want to actually DO anything with a binary file, other than passing it around, then you’ll need to know a lot about what the bytes in the file mean – and most likely, you’ll use a library for that – like an image processing library for the jpeg example above.</p>
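<p>As a small illustration of binary mode, the sketch below writes a few stand-in bytes (not a real image) that begin the way JPEG files do, with the bytes FF D8 FF. It reads them back in binary mode, then shows that text mode fails on the same file. The file name and the padding bytes are made up for the example.</p>

```python
import os
import tempfile

# Stand-in data, not a real image: JPEG files begin with the bytes FF D8 FF
fake_jpeg = b"\xff\xd8\xff\xe0" + bytes(16)

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "a_photo.jpg")

    with open(filename, "wb") as outfile:
        outfile.write(fake_jpeg)

    # binary mode returns bytes and never tries to decode them
    with open(filename, "rb") as infile:
        header = infile.read(3)
    looks_like_jpeg = header == b"\xff\xd8\xff"

    # text mode tries to decode, and 0xff is not a valid utf-8 start byte
    try:
        with open(filename, encoding="utf-8") as infile:
            infile.read()
        decode_failed = False
    except UnicodeDecodeError:
        decode_failed = True
```

<p>Checking the first few bytes like this only identifies the file type. Actually decoding the image is a job for an image library.</p>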
</div>
</div>
</div>
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
</div>
<hr/>
<div role="contentinfo">
<p>© Copyright 2020, University of Washington, Natasha Aleksandrova, Christopher Barker, Brian Dorsey, Cris Ewing, Christy Heaton, Jon Jacky, Maria McKinley, Andy Miles, Rick Riehle, Joseph Schilz, Joseph Sheedy, Hosung Song. Creative Commons Attribution-ShareAlike 4.0 license.</p>
</div>
</footer>
</div>
</div>
</section>
</div>
<script>
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>
</body>
</html>