Openstack RDO && KVM Hypervisor: December 2021

Thursday, December 23, 2021

Uploading dictionaries into SQLite tables via Pandas DataFrames on Python 3.9.9

The following below is a not entirely trivial algorithm for students that creates a Python dictionary after scanning text according to the conditions of the problem, which gets converted into Pandas Dataframe and uploaded to SQLite database table to be analyzed via nested SQL Query.

Problem itself

A text file (beta-wide.txt) consists of many lines, each of which can contain letters from the set A, B, C, D, E, F. For the entire file, you need to determine which letter is most often forms substrings of maximum length. As an answer, you must enter a string consisting of the found letters, alphabetically without delimiters

(.env) [boris@fedora34server WORK]$ ll beta-wide.txt

-rw-rw-r--. 1 boris boris 59420 Dec 23 16:20 beta-wide.txt

Python code building dictionary to be uploaded into SQLite table

via pandas DataFrame

import pandas as pd

import sqlite3

connection = sqlite3.connect('data_pandas.db')

c = connection.cursor()

c.execute('CREATE TABLE IF NOT EXISTS results(symv_name text, symv_qnt number)')

connection.commit()

seqLetter = []

file = open('./beta-wide.txt')

for line in file:

currLetrs = set()

curLen = 0

maxLen = 1

prev = ' '

for next in line:

if prev == next:

curLen +=1

if curLen > maxLen:

maxLen = curLen

currLetrs = set([next])

elif maxLen == curLen:

currLetrs.add(next)

else:

curLen =1

prev = next

seqLetter = seqLetter + list(currLetrs)

d = {}

for i in set(seqLetter):

key = i

d[key] = d[key] = d.get(key,0) + seqLetter.count(i)

data_items = d.items()

data_list = list(data_items)

df = pd.DataFrame(data_list, columns= ['symv_name','symv_qnt'])

df.to_sql('results', connection, if_exists='replace', index = False)

query = "SELECT * FROM results WHERE symv_qnt = (SELECT max(symv_qnt) FROM results)"

data = pd.read_sql_query(query,connection)

print(data)

SQLiteBrowser double check

Monday, December 13, 2021

Exersing Python 3.10 on Fedora 25 via sorting Pandas DataFrame

The following below is a not entirely trivial algorithm for students that creates a Python dictionary when scanning text according to the conditions of the problem, which is then processed as Pandas Dataframe to provide the output required

The original problem itself

The text file alpha-wide.txt contains only capital letters of the Latin alphabet (ABC… Z). Identify the character that most often occurs in the file immediately after the letter X. In the answer first write down this character, and then separate by blank how many times it occurred after letters X. If there are several such characters, you need to display the one that appears earlier in the alphabet.

The file alpha-wide.txt contains a pretty long string

Python code has been set up in a virtual environment with pandas installed via pip

import pandas as pd

with open("./alpha-wide.txt") as F:

string = F.readline() # getting next line

d={} # declare dictionary of letters

while "X" in string:

n = string.find("X") # index "X"

key = string[n+1] # create key = Letter

d[key] = d.get(key,0)+1

string = string[n+1:] # slice, form a new line

# print(d)

data_items = d.items()

data_list = list(data_items)

df = pd.DataFrame(data_list,columns = ['X1','Y1'])

df.sort_values("Y1", axis = 0, ascending = True, inplace = True, na_position ='last')

print("DataFrame has been sorted by Y1")

print(df)

Sunday, December 12, 2021

Exerising Python 3.10 on Fedora 25 via sorting Pandas DataFrame

The original problem itself

The file alpha-wide.txt contains a pretty long string

Python code has been set up in a virtual environment with pandas installed via pip
import pandas as pd
with open("./alpha-wide.txt") as F:
string = F.readline() # getting next line
d={} # declare dictionary of letters
while "X" in string:
n = string.find("X") # index "X"
key = string[n+1] # create key = Letter
d[key] = d.get(key,0)+1
string = string[n+1:] # slice, form a new line
# print(d)
data_items = d.items()
data_list = list(data_items)
df = pd.DataFrame(data_list,columns = ['X1','Y1'])
df.sort_values("Y1", axis = 0, ascending = True, inplace = True, na_position ='last')
print("DataFrame has been sorted by Y1")
print(df)

System information