How do I create a new line with reStructuredText?

How do I force a line break/new line in rst? I don't want it to be a new paragraph (i.e. no additional space between the lines), I just want the text to start on a new line. Thanks!

The line block syntax also worked, and was a bit cleaner:
| This is a line
| This is another line
| Another new line

According to the docs for docutils raw role, you can do this:
If there just *has* to be a line break here,
:raw-html:`<br />`
it can be accomplished with a "raw"-derived role.
But the line block syntax should be considered first.
You will need to define the raw role first:
.. role:: raw-html(raw)
   :format: html
As the example states, consider line block syntax first.
| Lend us a couple of bob till Thursday.
| I'm absolutely skint.
| But I'm expecting a postal order and I can pay you back
  as soon as it comes.
| Love, Ewan.

We added a substitution to our global.rst file so we didn't have to add raw tags all the time.
.. |br| raw:: html

   <br/>
This way we can just use |br| when we need a line break.
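For example, assuming global.rst is pulled into each document (e.g. via Sphinx's rst_epilog or an .. include:: directive), a paragraph can then read:

The first part of the sentence |br| and the rest forced onto a new line.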


Python strip() and readlines()

I have some code that I am trying to run which will compare a value from a CSV file to a threshold that I have set within the .py file.
My CSV file has output similar to the below, but with 1030 lines:
-46.62
-47.42
-47.36
-47.27
-47.36
-47.24
-47.24
-47.03
-47.12
Note: there are no blank lines between the values, but there is a single space before each one.
My first attempt was with this code:
file_in5 = open('710_edited_capture.csv', 'r')
line5=file_in5.readlines()
a=line5[102]
b=line5[307]
c=line5[512]
d=line5[717]
e=line5[922]
print[a]
print[b]
print[c]
print[d]
print[e]
which gave the output of:
[' -44.94\n']
[' -45.06\n']
[' -45.09\n']
[' -45.63\n']
[' -45.92\n']
My first thought was to use .strip() to remove the space and the \n, but this is not supported on lists and returns the error:
Traceback (most recent call last):
File "/root/test.py", line 101, in <module>
line5=line5.strip()
AttributeError: 'list' object has no attribute 'strip'
My next attempt was the code below:
for line5 in file_in5:
    line5=line5.strip()
line5=file_in5.readlines()
a=line5[102]
b=line5[307]
c=line5[512]
d=line5[717]
e=line5[922]
print[a]
print[b]
print[c]
print[d]
print[e]
Returns another error:
Traceback (most recent call last):
File "/root/test.py", line 91, in <module>
line5=file_in5.readlines()
ValueError: Mixing iteration and read methods would lose data
What is the most efficient way to read in just 5 specific lines without any spaces or \n, and then be able to use them in subsequent calculations such as:
if a>threshold and a>b and a>c and a>d and a>e:
    print ('a is highest and within limit')
    CF=a
You can use strip(), but then you need to use read() instead of readlines(). Alternatively, if you have more than one comma-separated value in a row, you can use code like the below:
with open('710_edited_capture.csv', 'r') as file:
    file_content = file.readlines()

for line in file_content:
    vals = line.strip().split(',')
    print(vals)
You can also append "vals" to an empty list. As a result, you will get a list that contains a list of values for each line.
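For example, a small sketch of that variant (the name all_vals is just illustrative):

all_vals = []
with open('710_edited_capture.csv', 'r') as file:
    for line in file:
        all_vals.append(line.strip().split(','))
# all_vals is now a list of lists, one inner list of values per input line
print(all_vals[102])  # e.g. the values from the 103rd line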
It's a little bit unclear what you want to do, but if you just want to read a file, compare each value to a threshold, and keep the values above it, here is an example:
threshold=46.2
outlist=[]
with open('data.csv', 'r') as data:
    for i in data:
        if float(i)>threshold:
            outlist.append(i)
Then you can adapt it to your needs.
Thanks for all the comments and suggestions, however they are not quite what I needed.
I have, however, applied a workaround, although admittedly a clunky one.
I created 5 additional files from the original, each containing only one value. From these I can now strip the space and \n and save each value locally as a variable, so I no longer needed readlines.
These variables can be compared to each other and the threshold to determine the optimum choice.
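For reference, the original goal (just five specific lines, stripped and converted to numbers) can also be reached without extra files. A minimal sketch, reusing the line indices from the question; the threshold value is only a placeholder:

with open('710_edited_capture.csv', 'r') as file_in5:
    lines = file_in5.readlines()

# pick the five lines, strip the space and \n, and convert them to numbers
a, b, c, d, e = (float(lines[i].strip()) for i in (102, 307, 512, 717, 922))

threshold = -46.0  # placeholder: substitute the real threshold from the .py file
if a > threshold and a > b and a > c and a > d and a > e:
    print('a is highest and within limit')
    CF = a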

What's the best way to match strings in a file to case class in Scala?

We have a file that contains data that we want to match to a case class. I know enough to brute-force it, but I'm looking for an idiomatic way to do it in Scala.
Given File:
#record
name:John Doe
age: 34
#record
name: Smith Holy
age: 33
# some comment
#record
# another comment
name: Martin Fowler
age: 99
(field values on two lines are INVALID, e.g. name:John\n Smith should error)
And the case class
case class Record(name:String, age:Int)
I want to return a Seq type, such as a Stream:
val records: Stream[Record]
A couple of ideas I'm working with, but so far haven't implemented:
Remove all new lines and treat the whole file as one long string. Then match on the string with "((?!name).)+((?!age).)+age:([\s\d]+)" and create a new object of my case class for each match, but so far my regex-fu is low and I can't match around comments.
Recursive idea: iterate through each line to find the first line that matches a record, then recursively call the function to match name, then age. Tail-recursively return Some(new Record(cumulativeMap.get(name), cumulativeMap.get(age))) or None when hitting the next record after name (i.e. age was never encountered).
Or is there a better idea?
Thanks for reading! The file is more complicated than the above, but the same rules apply. For the curious: I'm trying to parse a custom M3U playlist file format.
I'd use kantan.regex for a fairly trivial regex based solution.
Without fancy shapeless derivation, you can write the following:
import kantan.regex._
import kantan.regex.implicits._
case class Record(name:String, age:Int)
implicit val decoder = MatchDecoder.ordered(Record.apply _)
input.evalRegex[Record](rx"(?:name:\s*([^\n]+))\n(?:age:\s*([0-9]+))").toList
This yields:
List(Success(Record(John Doe,34)), Success(Record(Smith Holy,33)), Success(Record(Martin Fowler,99)))
Note that this solution requires you to hand-write the decoder, but it can often be derived automatically. If you don't mind a shapeless dependency, you could simply write:
import kantan.regex._
import kantan.regex.implicits._
import kantan.regex.generic._
case class Record(name:String, age:Int)
input.evalRegex[Record](rx"(?:name:\s*([^\n]+))\n(?:age:\s*([0-9]+))").toList
And get the exact same result.
Disclaimer: I'm the library's author.
You could use Parser Combinators.
If you have the file format specification in BNF or can write one, then Scala can create a parser for you from those rules. This may be more robust than hand-made regex based parsers. It's certainly more "Scala".
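For illustration, a rough sketch of what that could look like with the scala-parser-combinators module; the grammar below is inferred from the sample file rather than a real BNF spec, and the object and rule names are just placeholders:

import scala.util.parsing.combinator.RegexParsers

case class Record(name: String, age: Int)

object RecordParser extends RegexParsers {
  // "#record" markers and "# ..." comments are both treated the same way here
  private def comment: Parser[Unit] = """#[^\r\n]*""".r ^^ (_ => ())
  private def name: Parser[String] = "name:" ~> """[^\r\n]+""".r ^^ (_.trim)
  private def age: Parser[Int] = "age:" ~> """\d+""".r ^^ (_.toInt)

  // a record is any number of marker/comment lines followed by a name and an age
  private def record: Parser[Record] =
    rep(comment) ~> name ~ age ^^ { case n ~ a => Record(n, a) }

  private def records: Parser[Seq[Record]] = rep(record) <~ rep(comment)

  def parseRecords(input: String): ParseResult[Seq[Record]] = parseAll(records, input)
}

On the sample input, RecordParser.parseRecords(fileContents) should produce Success(List(Record(John Doe,34), Record(Smith Holy,33), Record(Martin Fowler,99))); a malformed file yields a Failure pointing at the offending position, which is where this approach tends to beat hand-rolled regexes.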
I don't have much experience in Scala, but could these regexes work:
You could use (?<=name:).* to match the name value, and (?<=age:).* to match the age value. If you use this, trim the spaces from the found matches, otherwise name: bob will match " bob" with a leading space, which you might not want.
If name: or any other tag appears inside a comment, or a comment comes after a value, it will still be matched. Please leave a comment if you want to avoid that.
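For illustration, a quick sketch of driving those patterns from Scala (untested; text is just a stand-in for the file contents):

val text = "name: Smith Holy\nage: 33"
val names = "(?<=name:).*".r.findAllIn(text).map(_.trim).toList  // List("Smith Holy")
val ages = "(?<=age:).*".r.findAllIn(text).map(_.trim).toList    // List("33")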
You could try this:
import java.nio.file.{Files, Paths}
import java.nio.charset.Charset
import scala.collection.JavaConverters._

val file = Paths.get("file.txt")
val lines = Files.readAllLines(file, Charset.defaultCharset()).asScala.toList
val records = lines.filter(s => s.startsWith("age:") || s.startsWith("name:"))
  .grouped(2).toList.map {
    case List(a, b) => Record(a.replaceAll("name:", "").trim,
                              b.replaceAll("age:", "").trim.toInt)
  }

Hello, I have code that prints what I need in Python, but I'd like it to write that result to a new file

The file looks like a series of lines with IDs:
aaaa
aass
asdd
adfg
aaaa
I'd like to get, in a new file, each ID and its number of occurrences in the old file, in this form:
aaaa 2
asdd 1
aass 1
adfg 1
With the two elements separated by a tab.
The code I have prints what I want but doesn't write it to a new file:
with open("Only1ID.txt", "r") as file:
file = [item.lower().replace("\n", "") for item in file.readlines()]
for item in sorted(set(file)):
print item.title(), file.count(item)
As you use Python 2, the simplest approach to convert your console output to file output is by using the print chevron (>>) syntax which redirects the output to any file-like object:
with open("filename", "w") as f: # open a file in write mode
print >> f, "some data" # print 'into the file'
Your code could look like this after simply adding another open to open the output file and adding the chevron to your print statement:
with open("Only1ID.txt", "r") as file, open("output.txt", "w") as out_file:
file = [item.lower().replace("\n", "") for item in file.readlines()]
for item in sorted(set(file)):
print >> out_file item.title(), file.count(item)
However, your code has a few other issues which you should avoid or could improve:
Do not use the same variable name file for both the file object returned by open and your processed list of strings. This is confusing, just use two different names.
You can directly iterate over the file object, which works like a generator that returns the file's lines as strings. A generator produces elements just in time: instead of first loading the whole file into memory like file.readlines() does and processing it afterwards, it reads and yields one line at a time, whenever the next line is needed. That improves the code's performance and resource efficiency.
If you write a list comprehension but don't actually need the result as a list, because you only want to iterate over it in a for loop, it's more efficient to use a generator expression (same effect as the file object's line generator described above). The only syntactical difference between a list comprehension and a generator expression is the brackets: replace [...] with (...) and you have a generator. The only downside of a generator is that you can neither find out its length nor access items directly by index. As you don't need either of these features, a generator is fine here (a tiny illustration follows after this list).
There is a simpler way to remove trailing newline characters from a line: line.rstrip() removes all trailing whitespaces. If you want to keep e.g. spaces, but only want the newline to be removed, pass that character as argument: line.rstrip("\n").
However, it could possibly be even easier and faster to just not add another implicit line break during the print call instead of removing it first to have it re-added later. You would suppress the line break of print in Python 2 by simply adding a comma at the end of the statement:
print >> out_file, item.title(), file.count(item),
There is a type Counter to count occurrences of elements in a collection, which is faster and easier than writing it yourself, because you don't need the additional count() call for every element. The Counter behaves mostly like a dictionary with your items as keys and their count as values. Simply import it from the collections module and use it like this:
from collections import Counter
c = Counter(lines)
for item in c:
    print item, c[item]
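As a tiny illustration of the list comprehension vs. generator expression point above (using an in-memory list as a stand-in for the file's lines):

raw_lines = ["Aaaa\n", "aass\n", "aaaa\n"]  # stand-in for the file's lines
as_list = [s.lower().replace("\n", "") for s in raw_lines]  # builds the whole list up front
as_gen = (s.lower().replace("\n", "") for s in raw_lines)   # yields one item at a time
for item in as_gen:
    print item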
With all those suggestions (except the one not to remove the line breaks) applied and the variables renamed to something more clear, the optimized code looks like this:
from collections import Counter
with open("Only1ID.txt") as in_file, open("output.txt", "w") as out_file:
counter = Counter(line.lower().rstrip("\n") for line in in_file)
for item in sorted(counter):
print >> out_file item.title(), counter[item]

Removing Duplicate Lines by Title Only

I am trying to modify a script so that it will remove duplicate lines from a text file using only the title portion of that line.
To clarify the text file lines look something like this:
Title|Image Url|Description|Page Url
At the moment the script does remove duplicates, but it does so by reading the entire line, not just the first part. All the lines in the file are not going to be 100% the same, but a few will be very similar.
I want to remove all of the lines that contain the same "title", regardless of what the rest of the line contains.
This is the script I am working with:
import sys
from collections import OrderedDict
infile = "testfile.txt"
outfile = "outfile.txt"
inf = open(infile,"r")
lines = inf.readlines()
inf.close()
newset = list(OrderedDict.fromkeys(lines))
outf = open(outfile,"w")
lstline = len(newset)
for i in range(0,lstline):
    ln = newset[i]
    outf.write(ln)
outf.close()
So far I have tried using .split() to split the lines in the list. I have also tried .readline(lines[0:25]) in hopes of using a character limit to achieve the desired results, but no luck so far. I also can't seem to find any documentation on my exact problem so I'm stuck.
I am using Windows 8 and Python 2.7.9 for this project if that helps.
I made a few changes to the program you had set up. First, I changed your file interactions to use "with" statements, since those are very convenient and automatically handle a lot of the functionality you had to write out. Second, I used a set instead of an OrderedDict, because you were basically just trying to emulate set functionality (exclusivity of elements) by using the keys of an OrderedDict. If a title hasn't been seen yet, the code adds it to the set so it can't be used again and writes the line to the output file; if it has been seen, it keeps going. I hope this helps you!
with open("testfile.txt") as infile:
with open("outfile.txt",'w') as outfile:
titleset = set()
for line in infile:
title = line.split('|')[0]
if title not in titleset:
titleset.add(title)
outfile.write(line)

How do I print each line here in a for loop

thanks for the follow :)
hii... if u want to make a new friend just add me on facebook! :) xx
Just wanna say if you ever feel lonely or sad or bored, just come and talk to me. I'm free anytime :)
I hope she not a spy for someone. I hope she real on neautral side. Because just her who i trust. :-)
not always but sometimes maybe :)
\u201c Funny how you get what you want and pray for when you want the same thing God wants. :)
Thank you :) can you follow me on Twitter so I can DM you?
RT dj got us a fallin in love and yeah earth number one m\u00fcsic listen thank you king :-)
found a cheeky weekend for \u00a380 return that's flights + hotel.. middle of april, im still looking pal :)
RT happy birthday mary ! Hope you have a good day :)
Thank god twitters not blocked on the school computers cause all my data is gone on my phone :(
enjoy tmrro. saw them earlier this wk here in tokyo :)
UPDATE:
OK, maybe my question was wrong. I have to do this:
Open the file and read from it
Remove some links, names and other stuff from it (I have used regex, but I don't know if that's the right way to do it)
After I have the clean text (only tweets with a sad face or happy face), I have to print each line out, because I have to loop over each one like this:
for line in tweets:
    if ':)' in line:
        cl.train(line, 'happy')
    elif ':(' in line:
        cl.train(line, 'sad')
You can see my code so far below, but it doesn't work yet.
import re
from pprint import pprint
tweets = []
tweets = open('englishtweet.txt').read()
regex_username = '@[^\s]*' # regex to detect username in file
regex_url = 'http[^\s]*' # regex to detect url in file
regex_names = '#[^\s]*' # regex to detect # in file
for username in re.findall(regex_username, tweets):
    tweets = tweets.replace(username, '')
for url in re.findall(regex_url, tweets):
    tweets = tweets.replace(url, '')
for names in re.findall(regex_names, tweets):
    tweets = tweets.replace(names, '')
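A minimal sketch of the remaining step, assuming the cleanup above and the classifier cl from the earlier snippet: after the replacements, tweets is still one big string, and splitlines() gives the per-line loop you describe:

for line in tweets.splitlines():
    line = line.strip()
    if not line:
        continue  # skip blank lines left over from the replacements
    if ':)' in line:
        cl.train(line, 'happy')  # cl is the classifier object from the question
    elif ':(' in line:
        cl.train(line, 'sad')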
If you want to read the first line, use next
with open("englishtweet.txt","r") as infile:
print next(infile).strip()
# this prints the first line only, and consumes the first value from the
# generator so this:
for line in infile:
print line.strip()
# will print every line BUT the first (since the first has been consumed)
I'm also using a context manager here, which will automatically close the file once you exit the with block instead of you having to remember to call tweets.close(). It also handles the error case: depending on what else you're doing in your code, an exception may be raised that never lets you reach the .close statement.
If your file is very small, you could use .readlines:
with open("englishtweet.txt","r") as infile:
tweets = infile.readlines()
# tweets is now a list, each element is a separate line from the file
print tweets[0] # so element 0 is the first line
for line in tweets[1:]: # the rest of the lines:
print line.strip()
However, reading a whole file into memory like that isn't really recommended, as with some files it can simply be a huge memory waster, especially if you only need the first line -- there's no reason to read the whole thing into memory.
That said, since it looks like you may be using these lines for more than just one iteration, maybe readlines IS the best approach here.
You almost have it. Just remove the .read() when you originally open the file. Then you can loop through the lines.
tweets = open('englishtweet.txt','r')
for line in tweets:
    print line
tweets.close()