<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width, initial-scale=1.0"><title>Get Vector with Counts for Positional Attribute. — get_count_vector • RcppCWB</title><!-- jquery --><script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js" integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo=" crossorigin="anonymous"></script><!-- Bootstrap --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/css/bootstrap.min.css" integrity="sha256-bZLfwXAP04zRMK2BjiO8iu9pf4FbLqX6zitd+tIvLhE=" crossorigin="anonymous"><script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/js/bootstrap.min.js" integrity="sha256-nuL8/2cJ5NDSSwnKD8VqreErSWHtnEP9E7AySL+1ev4=" crossorigin="anonymous"></script><!-- bootstrap-toc --><link rel="stylesheet" href="../bootstrap-toc.css"><script src="../bootstrap-toc.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous"><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous"><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script 
src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet"><script src="../pkgdown.js"></script><meta property="og:title" content="Get Vector with Counts for Positional Attribute. — get_count_vector"><meta property="og:description" content="The return value is an integer vector. The length of the vector is the number of
unique tokens in the corpus / the number of unique ids. The order of the counts
corresponds to the order of the ids."><!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]--></head><body data-spy="scroll" data-target="#toc">
<div class="container template-reference-topic">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">RcppCWB</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="">0.6.0</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav"><li>
<a href="../reference/index.html">Reference</a>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" data-bs-toggle="dropdown" aria-expanded="false">
Articles
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu"><li>
<a href="../articles/vignette.html">Writing performance code with RcppCWB</a>
</li>
</ul></li>
<li>
<a href="../news/index.html">Changelog</a>
</li>
</ul><ul class="nav navbar-nav navbar-right"><li>
<a href="https://github.com/PolMine/RcppCWB/" class="external-link">
<span class="fab fa-github fa-lg"></span>
</a>
</li>
</ul></div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header">
<h1>Get Vector with Counts for Positional Attribute.</h1>
<small class="dont-index">Source: <a href="https://github.com/PolMine/RcppCWB/blob/HEAD/R/count.R" class="external-link"><code>R/count.R</code></a></small>
<div class="hidden name"><code>get_count_vector.Rd</code></div>
</div>
<div class="ref-description">
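<p>The return value is an integer vector. The length of the vector is the number of
unique tokens in the corpus, i.e. the number of unique ids of the positional attribute.
The counts are ordered by id: as ids are zero-based, element <code>i</code> of the vector
is the count of the token with id <code>i - 1</code>.</p>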
</div>
<div id="ref-usage">
<div class="sourceCode"><pre class="sourceCode r"><code><span><span class="fu">get_count_vector</span><span class="op">(</span><span class="va">corpus</span>, <span class="va">p_attribute</span>, registry <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/Sys.getenv.html" class="external-link">Sys.getenv</a></span><span class="op">(</span><span class="st">"CORPUS_REGISTRY"</span><span class="op">)</span><span class="op">)</span></span></code></pre></div>
</div>
<div id="arguments">
<h2>Arguments</h2>
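<dl><dt>corpus</dt>
<dd><p>name of a CWB corpus, as a character string (e.g. <code>"REUTERS"</code>)</p></dd>
<dt>p_attribute</dt>
<dd><p>name of a positional attribute of the corpus (e.g. <code>"word"</code>)</p></dd>
<dt>registry</dt>
<dd><p>path to the registry directory; defaults to the value of the environment variable <code>CORPUS_REGISTRY</code></p></dd>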
</dl></div>
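<p>The default of <code>registry</code> is read from the environment when the function is called. The following base R sketch shows that behaviour with a hypothetical stand-in function, <code>resolve_registry()</code>, which mirrors only the default argument of the usage above; the paths are made up for illustration and nothing here touches RcppCWB itself:</p>
<div class="sourceCode"><pre class="sourceCode r"><code># Hypothetical stand-in: copies only the default of the registry argument.
resolve_registry &lt;- function(registry = Sys.getenv("CORPUS_REGISTRY")) {
  registry
}

# Remember the current setting so that it can be restored afterwards.
old &lt;- Sys.getenv("CORPUS_REGISTRY", unset = NA)

# Made-up path, used for illustration only.
Sys.setenv(CORPUS_REGISTRY = "/hypothetical/registry")

# The default is evaluated at call time, so it picks up the value just set.
from_env &lt;- resolve_registry()

# An explicitly supplied path takes precedence over the environment variable.
explicit &lt;- resolve_registry("/another/registry")

# Restore the previous state of the environment variable.
if (is.na(old)) {
  Sys.unsetenv("CORPUS_REGISTRY")
} else {
  Sys.setenv(CORPUS_REGISTRY = old)
}
</code></pre></div>
<p>Passing <code>registry</code> explicitly, as the example on this page does with <code>get_tmp_registry()</code>, avoids any dependence on the session environment.</p>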
<div id="value">
<h2>Value</h2>
<p>An integer vector of counts with one element per unique id of the positional attribute, ordered by id.</p>
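<p>The layout of the return value can be illustrated without a corpus. The following is a minimal base R sketch, assuming a made-up stream of zero-based token ids and a made-up lexicon (<code>ids</code> and <code>lexicon</code> are hypothetical, and <code>tabulate()</code> only mimics the shape of the result, not how RcppCWB obtains the counts):</p>
<div class="sourceCode"><pre class="sourceCode r"><code># Hypothetical toy data: a token stream encoded as zero-based ids.
ids &lt;- c(0L, 2L, 1L, 2L, 2L, 0L)

# Hypothetical lexicon: the string for id k is stored at position k + 1.
lexicon &lt;- c("oil", "price", "the")

# tabulate() counts positive integers, so shift the zero-based ids by one.
counts &lt;- tabulate(ids + 1L, nbins = length(lexicon))
# counts is c(2L, 1L, 3L): id 0 occurs twice, id 1 once, id 2 three times.

# Element i of the count vector belongs to id i - 1.
df &lt;- data.frame(
  token = lexicon,
  token_id = seq_along(counts) - 1L,
  count = counts
)
df &lt;- df[order(df[["count"]], decreasing = TRUE), ]
</code></pre></div>
<p>The same off-by-one shift applies when indexing a real count vector: the count for id <code>k</code> is found at position <code>k + 1</code>.</p>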
</div>
<div id="ref-examples">
<h2>Examples</h2>
<div class="sourceCode"><pre class="sourceCode r"><code><span class="r-in"><span><span class="va">y</span> <span class="op">&lt;-</span> <span class="fu">get_count_vector</span><span class="op">(</span></span></span>
<span class="r-in"><span> corpus <span class="op">=</span> <span class="st">"REUTERS"</span>, p_attribute <span class="op">=</span> <span class="st">"word"</span>,</span></span>
<span class="r-in"><span> registry <span class="op">=</span> <span class="fu"><a href="tmp_registry.html">get_tmp_registry</a></span><span class="op">(</span><span class="op">)</span></span></span>
<span class="r-in"><span> <span class="op">)</span></span></span>
<span class="r-in"><span><span class="va">df</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html" class="external-link">data.frame</a></span><span class="op">(</span>token_id <span class="op">=</span> <span class="fl">0</span><span class="op">:</span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/length.html" class="external-link">length</a></span><span class="op">(</span><span class="va">y</span><span class="op">)</span> <span class="op">-</span> <span class="fl">1</span><span class="op">)</span>, count <span class="op">=</span> <span class="va">y</span><span class="op">)</span></span></span>
<span class="r-in"><span><span class="va">df</span><span class="op">[[</span><span class="st">"token"</span><span class="op">]</span><span class="op">]</span> <span class="op">&lt;-</span> <span class="fu"><a href="p_attributes.html">cl_id2str</a></span><span class="op">(</span></span></span>
<span class="r-in"><span> <span class="st">"REUTERS"</span>, p_attribute <span class="op">=</span> <span class="st">"word"</span>,</span></span>
<span class="r-in"><span> id <span class="op">=</span> <span class="va">df</span><span class="op">[[</span><span class="st">"token_id"</span><span class="op">]</span><span class="op">]</span>, registry <span class="op">=</span> <span class="fu"><a href="tmp_registry.html">get_tmp_registry</a></span><span class="op">(</span><span class="op">)</span></span></span>
<span class="r-in"><span> <span class="op">)</span></span></span>
<span class="r-in"><span><span class="va">df</span> <span class="op">&lt;-</span> <span class="va">df</span><span class="op">[</span>,<span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"token"</span>, <span class="st">"token_id"</span>, <span class="st">"count"</span><span class="op">)</span><span class="op">]</span> <span class="co"># reorder columns</span></span></span>
<span class="r-in"><span><span class="va">df</span> <span class="op">&lt;-</span> <span class="va">df</span><span class="op">[</span><span class="fu"><a href="https://rdrr.io/r/base/order.html" class="external-link">order</a></span><span class="op">(</span><span class="va">df</span><span class="op">[[</span><span class="st">"count"</span><span class="op">]</span><span class="op">]</span>, decreasing <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span>,<span class="op">]</span></span></span>
<span class="r-in"><span><span class="fu"><a href="https://rdrr.io/r/utils/head.html" class="external-link">head</a></span><span class="op">(</span><span class="va">df</span><span class="op">)</span></span></span>
<span class="r-out co"><span class="r-pr">#&gt;</span>    token token_id count</span>
<span class="r-out co"><span class="r-pr">#&gt;</span> 32   the       31   206</span>
<span class="r-out co"><span class="r-pr">#&gt;</span> 30    to       29   134</span>
<span class="r-out co"><span class="r-pr">#&gt;</span> 38    of       37    97</span>
<span class="r-out co"><span class="r-pr">#&gt;</span> 36    in       35    84</span>
<span class="r-out co"><span class="r-pr">#&gt;</span> 16   oil       15    78</span>
<span class="r-out co"><span class="r-pr">#&gt;</span> 41   and       40    77</span>
</code></pre></div>
</div>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="pkgdown-sidebar">
<nav id="toc" data-toggle="toc" class="sticky-top"><h2 data-toc-skip>Contents</h2>
</nav></div>
</div>
<footer><div class="copyright">
<p></p><p>Developed by Andreas Blaette, Bernard Desgraupes, Sylvain Loiseau.</p>
</div>
<div class="pkgdown">
<p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.6.</p>
</div>
</footer></div>
</body></html>