[–]▶ No.808650>>808718 >>808788 >>809600 >>809861 >>811008 [Watch Thread][Show All Posts]
Last thread introduced a lot of bickering and anger, let's see if we can replicate that.
Rules
Commit your solution to the challenge in this thread. (Solutions written by women and solutions written in Rust receive a 10% bonus when being rated)
Challenge
The major element of a sequence of length L is the element that appears more than L/2 times. The challenge is to find that element in each sequence.
Your program should accept as its first argument a path to a filename. Each line of the file contains a sequence of integers N separated by commas, e.g.
Input sample:
92,19,19,76,19,21,19,85,19,19,19,94,19,19,22,67,83,19,19,54,59,1,19,19
92,11,30,92,1,11,92,38,92,92,43,92,92,51,92,36,97,92,92,92,43,22,84,92,92
4,79,89,98,48,42,39,79,55,70,21,39,98,16,96,2,10,24,14,47,0,50,95,20,95,48,50,12,42
For each sequence, print out the major element, or "None" if there is no such element, e.g.
Output sample:
19
92
None
Constraints:
N is in range [0, 100]
L is in range [10000, 30000]
The number of test cases <= 40
▶ No.808652>>808655
I think there should be a week waiting time before anyone is allowed to reply so OP has to do his homework himself first.
▶ No.808653>>808655 >>808794
1. sort list
2. begin looping through items starting at beginning of list
3. if the current element's value is equal to the value at the current index + L/2, return that element
4. keep going until such a value is found, or index L/2 is reached (then there is no major element)
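As actual runnable code, those steps look something like this rough Python sketch (my naming, and O(n log n) because of the sort):
def major_sorted(seq):
    # steps 1-4 above: after sorting, a value that fills more than half
    # the list must occupy both index i and index i + L//2 for some i
    s = sorted(seq)
    half = len(s) // 2
    for i in range(len(s) - half):
        if s[i] == s[i + half]:
            return s[i]
    return None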
▶ No.808655>>808660 >>811008
>>808652
Not my homework, nice try.
>>808653
No pseudocode allowed.
▶ No.808660>>811008
>>808655
op copied the problem from https://www.codeeval.com/browse/132/
he wants us to come up with an original answer for him to put on his profile
▶ No.808712>>808715
Done. It could be neater, and I've taken some liberties with the output so that you can't use it as homework, but it works.
Save as homework.perl and use it like so:
perl homework.perl "92,19,19,76,19,21,19,85,19,19,19,94,19,19,22,67,83,19,19,54,59,1,19,19"
>19
perl homework.perl "92,11,30,92,1,11,92,38,92,92,43,92,92,51,92,36,97,92,92,92,43,22,84,92,92"
>92
perl homework.perl "4,79,89,98,48,42,39,79,55,70,21,39,98,16,96,2,10,24,14,47,0,50,95,20,95,48,50,12,42"
>None
perl homework.perl
>Crashes because I'm too lazy to add an alternative.
https://hastebin.com/ribenivili.pl
▶ No.808715>>808719 >>809087
▶ No.808718
>>808650 (OP)
>women and rust get a 10% bonus
Your 'challenge' gets a 110% penalty for this cancer, OP. Hang your head in shame.
▶ No.808719
>>808715
OP gets the code he deserves in the language he deserves.
At least he would, but I don't know Rust.
▶ No.808720
I'm a black Jewish trans M2F otherkin who identifies as lesbian. I felt unsafe coding in Rust which is a toxic masculine name. Coded in Mathematica because it's proprietary.
majorElement[lst_] :=
SelectFirst[
Tally[lst], #[[2]] > Length[lst]/2 &] /.
{Missing[_] -> "None",
{x_, c_} -> x}
▶ No.808721
lmao @people thinking anyone would come to /tech/ for homework
just post it on cuckoverflow where people routinely solve homework for some upboats
▶ No.808728>>808731
Use a dictionary
O(n), can't do it faster I believe.
▶ No.808731>>808732
>>808728
It can be done in O(1) time you dumb nigger.
▶ No.808732>>808734
>>808731
How? **You dumb nigger*
▶ No.808734>>808735
>>808732
O(k) is asymptotically equivalent to O(1) for any constant k you double nigger.
▶ No.808735>>808749
>>808734
O(L) then. Happy faggot? Learn standard notation, n is the length of the sequence
▶ No.808749>>808753
>>808735
Please give me a value of n where O(n) wouldn't be a constant.
▶ No.808753>>808757 >>808761 >>808770
>>808749
n=a variable=the length of the sequence
You're insane. Now show me how to do it in constant time rather than linear, faggot
▶ No.808757
>>808753
no you double faggot
▶ No.808761
>>808753
Impossible. The only alternative to O(n) would be probabilistic (Monte Carlo) methods, correct up to some desired confidence interval.
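For the curious, a hedged sketch of that Monte Carlo route (hypothetical Python, names made up): probe random positions, take the mode of the probes, and accept it only if it covers more than half of them. It returns an unverified guess, wrong with probability that shrinks exponentially in the probe count.
import random

def probably_major(seq, probes=64):
    # a true major element covers more than half the positions, so each
    # random probe hits it with probability > 1/2
    sample = [random.choice(seq) for _ in range(probes)]
    candidate = max(set(sample), key=sample.count)
    if sample.count(candidate) > probes / 2:
        return candidate  # Monte Carlo guess, never verified
    return None  # probably no major element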
▶ No.808770
>>808753
>n=a variable=the length of the sequence
No, nigger, that's L.
>show me how to do it in constant time
It's trivial. Do you even know what constant time means?
▶ No.808771
jesus christ stop taking such obvious bait
▶ No.808788>>808791 >>808795
>>808650 (OP)
>N is in range [0, 100]
this is retarded just create an array of 100 counters and record the largest one.
That's it, now do your fucking homework.
▶ No.808791>>809453
>>808788
> 0 to 100
> create 100 counters
You don’t deserve those dubs.
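For the record, the counting-array idea as a minimal Python sketch (hypothetical, with the 101 bins it actually needs):
def major_counted(seq):
    counts = [0] * 101  # N is in [0, 100] inclusive
    for n in seq:
        counts[n] += 1
    top = max(range(101), key=lambda n: counts[n])
    return top if counts[top] > len(seq) / 2 else None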
▶ No.808794
I can do it in O(\alpha n):
Get a hash table where putting shit in each bin is O(\alpha)
Put every item there.
I assume the hash table has a "majorest element" and "amount of times it appears" that gets updated when you put the element in each bin (which stores how many times the element appeared).
Then take the majorest element of the hash table.
My solution is "faster" than >>808653 because I don't do the n log(n) bit. All you have to do is control \alpha and you're golden (most times \alpha won't go over a constant). Just have a good hash function that doesn't put everything in the same bin.
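Concretely, with a dict standing in for the hash table and the "majorest element" tracked on insert (a sketch, names made up):
def major_hashed(seq):
    counts = {}
    majorest, majorest_count = None, 0
    for x in seq:
        counts[x] = counts.get(x, 0) + 1
        if counts[x] > majorest_count:
            majorest, majorest_count = x, counts[x]
    # one final O(1) check instead of the n log(n) sort
    return majorest if majorest_count > len(seq) / 2 else None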
▶ No.808795>>808801
>>808788
Right. That there is the hash function. Then my answer is O(L).
▶ No.808801>>808802
>>808795
It's O(n) you dumb nigger.
▶ No.808802
>>808801
>can't do it in constant time
>calls others dumb niggers
▶ No.808805>>808808
function major(L)
    local M = {}
    local max_n = nil
    local max_count = 0
    local length = 0
    for n in L do
        M[n] = (M[n] or 0) + 1
        length = length + 1
    end
    for n, count in pairs(M) do
        if count > max_count then
            max_n = n
            max_count = count
        end
    end
    if max_count > length / 2 then
        return max_n
    end
end

for line in io.lines(arg[1]) do
    print(major(string.gmatch(line, "%d+")) or "None")
end
Usage:
$ cat data
92,19,19,76,19,21,19,85,19,19,19,94,19,19,22,67,83,19,19,54,59,1,19,19
92,11,30,92,1,11,92,38,92,92,43,92,92,51,92,36,97,92,92,92,43,22,84,92,92
4,79,89,98,48,42,39,79,55,70,21,39,98,16,96,2,10,24,14,47,0,50,95,20,95,48,50,12,42
$ luajit shit.lua data
19
92
None
▶ No.808808
>>808805
>helping pajeet get hired for a job he isn't remotely qualified for
▶ No.808872>>808894 >>809100 >>809238 >>809255 >>809266 >>809648
Gotta go fast, faggots. Also, I identify as I woman.
#include <unistd.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define MAX_L 30000
#define SEPARATOR_LENGTH 1
#define EOL_LENGTH 1
#define MAX_N_LENGTH 1
#define MAX_CASES 40
#define MAX_LINE_LENGTH ((SEPARATOR_LENGTH * (MAX_L - 1)) + (MAX_N_LENGTH * MAX_L) + EOL_LENGTH)
#define MAX_FILE_LENGTH MAX_LINE_LENGTH * MAX_CASES

char buffer[MAX_FILE_LENGTH + 1];
uint_fast64_t counts[100];
uint_fast64_t const tens[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};

inline uint_fast64_t parse_n(char * restrict * restrict c) {
    uint_fast64_t retval = 0;
    char *begin = *c;
    while(**c >= '0' && **c <= '9') {
        ++*c;
    }
    int len = *c - begin;
    switch(len) {
    case 3: retval = 100; break;
    case 2: retval = tens[begin[len - 2] - '0'];
    case 1: retval += begin[len - 1] - '0';
    }
    return retval;
}

int main(int argc, char *argv[]) {
    uint_fast64_t n;
    uint_fast16_t l;
    int i;
    int fd = open(argv[1], O_RDONLY);
    (void) read(fd, buffer, MAX_FILE_LENGTH);
    char *c = buffer;
    goto start;
    for(; *c; ++c) {
        (void) memset(counts, 0, sizeof(counts));
start:
        for(l = 1; ; ++l, ++c) {
            n = parse_n(&c);
            ++counts[n];
            if(*c == '\n') break;
        }
        uint_fast16_t half = l >> 1;
        for(i = 0; i < 100; ++i) {
            if(counts[i] > half) {
                printf("%d\n", i);
                break;
            }
        }
        if(i == 100) printf("None\n");
    }
}
It could go faster with pthreads but it's just extra bulk for what would be a pretty boring split of a buffer across threads.
▶ No.808894>>808911
>>808872
You need 101 bins, faggot.
▶ No.808911>>808928
>>808894
No, I identify as a woman, I can get away with it.
▶ No.808928
>>808911
I am so sorry. Will educate myself to be aware of my own privilege. I will learn from this teaching experience.
▶ No.809087>>809089
>>808715
Perl beautiful language
perl -nE 'chomp;my@a=split/,/;my%c;$c{$_}++for@a;say((grep{$c{$_}>@a/2}@a)[0]//"None")' cocks
I'm not gay
▶ No.809089
>>809087
Post the source not the binary, dumbass.
▶ No.809100>>809107
>>808872
> goto start;
> for(; *c; ++c) {
> (void) memset(counts, 0, sizeof(counts));
>start:
What is going on here?
▶ No.809107>>809111
>>809100
I'm skipping the first initialization since it's guaranteed to start zeroed. Jumping midway into a loop on the first iteration is generally a cleaner way to deal with that.
▶ No.809111
>>809107
Argh, I need to sleep, I swear that { wasn't there when I first looked.
▶ No.809140>>809141
Bump. Do something for once, /tech/.
▶ No.809141
>>809140
Even 4/g/ is better at programming than /tech/ lmao
▶ No.809232>>809255
import argparse
from multiprocessing import Pool

def findMajors(sequence):
    sequence = sequence.split(',')
    count_list = []
    for integer in sequence:
        count = 1
        for search in sequence:
            if integer == search:
                count += 1
        count_list.append(count)
    major_index = []
    index = 0
    for count in count_list:
        if count > len(sequence) / 2:
            if index not in major_index:
                major_index.append(index)
        index += 1
    major_integers = []
    for major in major_index:
        if sequence[major] not in major_integers:
            major_integers.append(sequence[major])
    return major_integers

def worker_process(sequence):
    major_integers = findMajors(sequence)
    if major_integers == ['']:
        print 'None'
    else:
        for major in major_integers:
            print major

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('filename', type=str)
    args = parser.parse_args()
    try:
        sequences = open(args.filename, 'r').read().split('\n')
    except:
        print 'None'
    p = Pool(len(sequences))
    p.map(worker_process, sequences)

if __name__ == "__main__":
    main()
▶ No.809238>>809245
>>808872
g++ challenge.c -o a.out
challenge.c:21:46: error: expected ‘,’ or ‘...’ before ‘*’ token
inline uint_fast64_t parse_n(char * restrict * restrict c) {
^
challenge.c: In function ‘uint_fast64_t parse_n(char*)’:
challenge.c:24:18: error: ‘c’ was not declared in this scope
char *begin = *c;
^
challenge.c: In function ‘int main(int, char**)’:
challenge.c:54:21: error: cannot convert ‘char**’ to ‘char*’ for argument ‘1’ to ‘uint_fast64_t parse_n(char*)’
n = parse_n(&c);
▶ No.809245>>809251
▶ No.809251>>809252 >>809253
>>809245
gcc challenge.c
In function `main':
challenge.c:(.text+0x87): undefined reference to `parse_n'
collect2: error: ld returned 1 exit status
▶ No.809252>>809253
>>809251
remove the 'inline' from the start of parse_n's definition.
▶ No.809253>>809259
>>809251
Why are you even on this board if you don't know shit about technology?
>$ gcc anon.c -std=c99 -O2
>>809252
No retard.
▶ No.809255>>809770
>>808872
Amazing, doubt anyone can compete with this
>>809232
Broken and busted
▶ No.809259
>>809253
>he doesn't use C89
hipster.
▶ No.809260
[[$]] major_search.
$<[[containers]]::*.
$<[[iterators]]::while,break.
$<[[stdio]]::[File printf].
$<[[string]]::String.
$::main<%({argc: i32 argv: u8*[]} i32 ->
argc!=2? $::printf{"Usage: %s [file]" argv[0]}.
list_count := 0.
new counts: $::HashMap(i32 i32).
f := $::File::open argv[1].
f::seek 0.
str := $::String::new{f::read f::tell}.
f::close!.
$::while %(bool -> !str::empty!) %(->
pos := str::find','.
!pos? $::break!.
num: i32 = str::substr{0, pos}::to_i32!.
!counts::contains{num}?
counts::set{num 1} :
counts::set{num counts::get{num}+1}.
str::shift pos+1.
list_count++.
).
counts::map %({key: i32 val: i32} ->
val >= list_count/2? (
$::printf{"%d\n" key}. $=0.
).
).
$::printf"None\n".
0.
).
▶ No.809276>>809288 >>809370 >>810399 >>811220
#!/usr/bin/env python3
import sys
from collections import Counter

def sequences(filename):
    with open(filename, "r") as f:
        for line in f:
            yield [int(i) for i in line.split(",")]

def major(sequence):
    if not sequence:
        return None
    element, count = Counter(sequence).most_common()[0]
    if count <= (len(sequence) / 2):
        return None
    return element

def main():
    for sequence in sequences(sys.argv[1]):
        print(major(sequence))

if __name__ == "__main__":
    main()
▶ No.809279>>809285 >>809307
>retards actually doing this faggots homework
lmao
▶ No.809285
>>809279
I encourage retards to get by through stealing homework, they'll inevitably fail their exams.
▶ No.809288>>809377
>>809276
gib terminal configuration
▶ No.809298>>809569 >>809578 >>809592
#!/bin/python2.7
import argparse
import sqlite3
from multiprocessing import Pool

def dowork(sequence):
    conn = sqlite3.connect(':memory:')
    cursor = conn.cursor()
    seq_list = sequence.split(',')
    target_length = (len(seq_list) / 2)
    cursor.execute('CREATE TABLE IF NOT EXISTS challenge (integers integer)')
    for integer in seq_list:
        cursor.execute('INSERT INTO challenge VALUES (?)', (int(integer),))
    conn.commit()
    cursor.execute('SELECT integers, COUNT(*) FROM challenge GROUP BY integers HAVING COUNT(*) > (?)', (target_length,))
    fetch = cursor.fetchone()
    if fetch != None:
        return fetch[0]
    else:
        return 'None'
    cursor.execute('DROP TABLE challenge')
    conn.commit()
    conn.close()

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('filename', type=str)
    args = parser.parse_args()
    try:
        sequences = open(args.filename, 'r').read().rstrip().split('\n')
    except:
        print 'None'
    p = Pool(8)
    majors = p.map(dowork, sequences)
    for major in majors:
        print major

if __name__ == "__main__":
    main()
▶ No.809307>>809313 >>809317 >>809319 >>809329 >>809705
>>809279
>tfw someone actually gets assigned this retarded exercises
Am I the only Europoor CS student here? We barely see code, here, we mostly focus on the logical/mathematical aspect of things.
▶ No.809313
>>809307
Which is why you'll be able to use ANY language when you graduate.
▶ No.809316
Wow, you're making me realize just how much I suck at programming, OP.
I tried doing this in C and I'm struggling to parse the fucking input.
I guess this is what happens when you only do algorithms (which is what you should be learning in school).
Also here's a hint to the people in this thread.
If you're using a hash table or any kind of bucket you're probably doing it wrong.
▶ No.809317
>>809307
Nope, "computer science" was considered a branch of mathematics when I went to school.
▶ No.809319
>>809307
Europoor here, we did Java in the first year and Assembly, Haskell and Prolog in the second. Half of each year and the whole of year 3 got delegated to theory, and we picked our own programming languages to fuck around with. The problem is that we all picked different languages to use primarily, and there were a lot of arguments on Python2 vs 3, PHP vs Perl and Java vs C#.
Theory is great, but you have to get your hands dirty with code too.
▶ No.809324>>809390 >>809397 >>809494 >>811220
use std::collections::HashMap;
use std::fs::File;
use std::io::{BufRead, BufReader};
fn main() {
let mut args = std::env::args_os().skip(1);
assert_eq!(args.len(), 1, "exactly 1 arg you stupid nigger");
let file = File::open(args.next().unwrap()).expect("i can't open this file you retarded fag");
let reader = BufReader::new(file);
let mut map = HashMap::new();
for line in reader
.lines()
.map(|r| r.expect("your fucking shit isnt utf8")) {
let mut count = 0;
for i in line.split(',')
.map(|s| s.parse::<u32>().expect("hitler dindu nuffin")) {
*map.entry(i).or_insert(0) += 1;
count += 1;
}
if let Some((k, _)) = map.drain()
.filter(|&(_, v)| v > count / 2) {
.max_by_key(|&(_, v)| v) {
println!("{}", k);
} else {
println!("None");
}
}
}
▶ No.809329
>>809307
>We barely see code, here, we mostly focus on the logical/mathematical aspect of things.
That's because Euro CS is dead. In the '90s, you used to be all code and would do class projects as an entire class led by the professor. It's why Euro CS grads were so good back then. Today, you have the standard post-modern Jewish CS education where you learn no CS and produce CS Grads.
▶ No.809352>>809361 >>809370 >>811220
#include <stdio.h>
#include <stdlib.h>

#define L_MAX 30000

struct element {
    int value;
    int n;
};

struct element_list {
    struct element elements[L_MAX];
    size_t index;
    size_t max_index;
};

void
initElementList(struct element_list *elist)
{
    elist->index = 0;
    elist->max_index = 0;
}

int
findElementByValue(int value, struct element_list *elist)
{
    int i;
    for (i = 0; i < elist->max_index; i++) {
        if (elist->elements[i].value == value)
            return i;
    }
    return -1;
}

void
addValueToList(int value, struct element_list *elist) {
    int idx;
    if ((idx = findElementByValue(value, elist)) == -1) {
        elist->elements[elist->max_index].value = value;
        elist->elements[elist->max_index].n = 1;
        elist->max_index++;
    } else {
        elist->elements[idx].n++;
    }
}

int
findMajorElement(struct element_list *elist)
{
    int i;
    int max = 0;
    int max_index = 0;
    if (elist->max_index == 0) return -1;
    for (i = 0; i < elist->max_index; i++) {
        if (elist->elements[i].n > max) {
            max = elist->elements[i].n;
            max_index = i;
        }
    }
    if (max > (elist->max_index / 2)) {
        return elist->elements[max_index].value;
    }
    return -1;
}

int
main (int argc, char *argv[])
{
    FILE *fp;
    char c;
    char buf[3];
    size_t buf_idx = 0;
    struct element_list elist;
    int major;
    initElementList(&elist);
    if (argc == 1) {
        printf("No input file.\n");
    } else {
        while (--argc > 0) {
            if (!(fp = fopen(*++argv, "r"))) {
                printf("Can not open file.");
                exit(1);
            } else {
                while ((c = getc(fp)) != EOF) {
                    if ((c >= '0') && (c <= '9')) {
                        buf[buf_idx] = c;
                        buf_idx++;
                    } else if ((c == ',') || (c == '\n')) {
                        if (buf_idx <= 2) buf[buf_idx] = '\0';
                        buf_idx = 0;
                        addValueToList(atoi(buf), &elist);
                        if (c == '\n') {
                            if ((major = findMajorElement(&elist)) != -1) {
                                printf("%d\n", major);
                            } else {
                                printf("None\n");
                            }
                            initElementList(&elist);
                        }
                    } else if (c == ' ') {
                        /* Ignore spaces. */
                    } else {
                        printf("Unexpected character %c.\n", c);
                    }
                }
                fclose(fp);
            }
        }
    }
    return 0;
}
From 0 to pajeet how bad is my code?
▶ No.809361
>>809352
>calling an array a list
>O(N) insertion to array when the problem requires an int n->int m map and n can only have 101 possible values
>mixed bracing
>function types on the previous line
>using a 3-char buf instead of replacing ',' and '\0' and reusing parts of the input string smh
>getc()'s return assigned to a char
>trying to printf("%c") the -1 you'll get from getc() on EOF
0.2pajeet
I hate you
▶ No.809370
>>809276
pic related
>>809352
Failed
▶ No.809377
>>809288
Xresources
URxvt.perl-ext:
URxvt.perl-ext-common: selection-to-clipboard
#define termfont xft:tewi:pixelsize=13:antialias=false
URxvt.font: termfont
URxvt.boldFont: termfont
URxvt.intensityStyles: false
URxvt.boldMode: false
URxvt.skipBuiltinGlyphs: true
URxvt.loginShell: true
URxvt*title: urxvt
URxvt*termName: rxvt-unicode
URxvt*urgentOnBell: true
URxvt.letterSpacing: 1
URxvt.borderless: true
URxvt.internalBorder: 20
URxvt.scrollBar: false
URxvt.scrollBar_right: false
URxvt.scrollstyle: plain
URxvt.jumpScroll: false
URxvt.scrollWithBuffer: true
URxvt.scrollTtyKeypress: true
URxvt.scrollTtyOutput: false
URxvt.buffered: true
URxvt.pointerBlank: true
URxvt.underlineURLs: true
URxvt.saveLines: 2048
URxvt.url-select.launcher: /opt/waterfox/waterfox-bin
URxvt.url-select.underline: true
URxvt.iso14755: false
URxvt.iso14755_52: false
Xft.antialias: true
Xft.autohint: false
Xft.dpi: 96
Xft.hinting: true
Xft.hintstyle: hintslight
Xft.lcdfilter: lcddefault
Xft.rgba: rgb
Xcursor.size: 15
URxvt.utf8: true
URxvt.locale: true
! X colors.
! Generated by 'wal'
emacs*foreground: #BDBBD2
emacs*background: #1D1C22
URxvt*foreground: #BDBBD2
XTerm*foreground: #BDBBD2
UXTerm*foreground: #BDBBD2
URxvt*background: [100]#1D1C22
XTerm*background: #1D1C22
UXTerm*background: #1D1C22
URxvt*cursorColor: #BDBBD2
XTerm*cursorColor: #BDBBD2
UXTerm*cursorColor: #BDBBD2
URxvt*borderColor: [100]#1D1C22
! Colors 0-15.
*.color0: #1D1C22
*color0: #1D1C22
*.color1: #588877
*color1: #588877
*.color2: #606F8B
*color2: #606F8B
*.color3: #D060AE
*color3: #D060AE
*.color4: #19A494
*color4: #19A494
*.color5: #609E9B
*color5: #609E9B
*.color6: #43DBD4
*color6: #43DBD4
*.color7: #BDBBD2
*color7: #BDBBD2
*.color8: #77767a
*color8: #77767a
*.color9: #588877
*color9: #588877
*.color10: #606F8B
*color10: #606F8B
*.color11: #D060AE
*color11: #D060AE
*.color12: #19A494
*color12: #19A494
*.color13: #609E9B
*color13: #609E9B
*.color14: #43DBD4
*color14: #43DBD4
*.color15: #BDBBD2
*color15: #BDBBD2
! Black color that will not be affected by bold highlighting.
*.color66: #1D1C22
*color66: #1D1C22
! Rofi colors.
rofi.color-window: #1D1C22, #1D1C22, #606F8B
rofi.color-normal: #1D1C22, #BDBBD2, #1D1C22, #606F8B, #1D1C22
rofi.color-active: #1D1C22, #BDBBD2, #1D1C22, #606F8B, #1D1C22
rofi.color-urgent: #1D1C22, #588877, #1D1C22, #588877, #BDBBD2
! Xclock colors.
XClock*foreground: #BDBBD2
XClock*background: #1D1C22
XClock*majorColor: rgba:bd/bb/d2/ff
XClock*minorColor: rgba:bd/bb/d2/ff
XClock*hourColor: rgba:bd/bb/d2/ff
XClock*minuteColor: rgba:bd/bb/d2/ff
XClock*secondColor: rgba:bd/bb/d2/ff
! Set depth to make transparency work.
URxvt*depth: 32
font: https://github.com/Lucy/tewi-font
▶ No.809390>>809400 >>809509 >>811220
>>809324
The same thing without the hashmap
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::env::args_os;
fn main() {
let filename = args_os().skip(1).next().expect(
"exactly 1 arg you stupid nigger",
);
let reader = BufReader::new(File::open(filename).expect(
"i can't open this file you retarded fag",
));
for line in reader.lines().map(
|r| r.expect("your fucking shit isnt utf8"),
)
{
let mut buf: [u32; 101] = [0; 101];
let mut cnt = 0;
for n in line.split(',') {
buf[n.parse::<usize>().expect("hitler dindu nuffin")] += 1;
cnt += 1;
}
match buf.into_iter().position(|x| *x > cnt / 2) {
Some(n) => println!("{}", n),
_ => println!("None"),
}
}
}
▶ No.809397
>>809324
Now we need the Ada version.
▶ No.809400>>809404
>>809390
>expect("error message")
expect
the English word "expect"
used for that
<triggered
▶ No.809453>>809475
>>808791
Alright, you win. 99 then.
▶ No.809475>>809482
>>809453
101 my black friend.
▶ No.809482>>809485
>>809475
I bet you were the brightest kid in the short bus.
▶ No.809485>>811089
>>809482
Thank me for bothering to correct you.
▶ No.809494>>809537
>>809324
> let mut args = std::env::args_os().skip(1);
▶ No.809509
>>809390
So how does it perform?
▶ No.809537
▶ No.809569>>809574
>>809298
i wonder why this isn't faster than it is. it's not much faster than my straight python solution, isn't the sqlite3 module for python C? All the work here should be done by sqlite3 but it's still slow.
▶ No.809574
>>809569
Databases are ridiculously fucking slow, anon. And sqlite3 is slow for a database.
▶ No.809578>>809582 >>809592
>>809298
There's up to 30,000 numbers in each test right? You're probably spending a lot of time inserting each row individually. Look into doing a bulk insert, or possibly grouping those inserts into a single transaction. Of course this is just conjecture, and you should profile to see where time is being spent.
▶ No.809582>>809587 >>809591 >>809648 >>809687 >>809774 >>810378
>>809578
A script to create test data if anyone wants it.
#!/usr/bin/python
from random import random

for t in xrange(40):
    l = 10000 + int((30000 - 10000 + 1) * random())
    bias = 10 * random()
    result = []
    for i in xrange(l):
        n = int(random()**bias * (100 + 1))
        result.append(n)
    print ",".join([str(x) for x in result])
▶ No.809587>>809591
>>809582
i was _just_ about to write one myself. thx
▶ No.809591>>809593
>>809582
>>809587
same, already did too
import random

string = ''
row_count = 40
integer_count = 50000
row_counter = 0
while(row_counter < row_count):
    integer_counter = 0
    while(integer_counter < integer_count):
        string += str(random.randint(0, 100)) + ','
        integer_counter += 1
    string = string[:-1] + '\n'
    row_counter += 1
string = string[:-1]
f = open('testdata', 'w')
f.write(string)
f.close()
▶ No.809592>>809602 >>809666 >>809825 >>810094
>>809578
doing a batch insert with executemany instead of one by one improved performance by 500%.
testing on 40 rows with 50k integers each:
10-11s with
>>809298
2-2.4s with
#!/bin/python2.7
import argparse
import sqlite3
from multiprocessing import Pool

def dowork(sequence):
    conn = sqlite3.connect(':memory:')
    cursor = conn.cursor()
    seq_list = sequence.split(',')
    target_length = (len(seq_list) / 2)
    seq_list_int = []
    for integer_str in seq_list:
        seq_list_int.append((int(integer_str),))
    cursor.execute('CREATE TABLE challenge (integers integer)')
    cursor.executemany('INSERT INTO challenge VALUES (?)', seq_list_int)
    conn.commit()
    cursor.execute('''SELECT integers,max(rowid) FROM challenge
        GROUP BY integers HAVING COUNT(1) > (?) LIMIT 1''', (target_length,))
    fetch = cursor.fetchone()
    if fetch != None:
        return fetch[0]
    else:
        return 'None'
    cursor.execute('DROP TABLE challenge')
    conn.commit()
    conn.close()

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('filename', type=str)
    args = parser.parse_args()
    try:
        sequences = open(args.filename, 'r').read().rstrip().split('\n')
    except:
        print 'None'
    p = Pool(8)
    majors = p.map(dowork, sequences)
    for major in majors:
        print major

if __name__ == "__main__":
    main()
▶ No.809593
>>809591
>integer_count=50000
>L is in range [10000, 30000]
wat
▶ No.809600>>809606 >>809627
>>808650 (OP)
IMPORTANT: OP PLEASE SPECIFY
are you saying that there is only one element which appears more than L/2 times?
▶ No.809602>>809605 >>809611 >>809666
>>809592
using list comprehensions to generate the tuple list being fed into the table is another 20% performance boost, now 1.8-2.0s instead of 2-2.4
def dowork(sequence):
    conn = sqlite3.connect(':memory:')
    cursor = conn.cursor()
    seq_list = sequence.split(',')
    seq_list = [(x,) for x in seq_list]
    target_length = (len(seq_list) / 2)
    cursor.execute('CREATE TABLE challenge (integers integer)')
    cursor.executemany('INSERT INTO challenge VALUES (?)', seq_list)
    conn.commit()
    cursor.execute('''SELECT integers,max(rowid) FROM challenge
        GROUP BY integers HAVING COUNT(1) > (?) LIMIT 1''', (target_length,))
    fetch = cursor.fetchone()
    if fetch != None:
        return fetch[0]
    else:
        return 'None'
    cursor.execute('DROP TABLE challenge')
    conn.commit()
    conn.close()
▶ No.809605
>>809602
I thought i screwed up here
seq_list=[(x,) for x in seq_list ]
should have been
seq_list=[(int(x),) for x in seq_list ]
but apparently it doesn't matter, sqlite will do the conversion if it can, which is probably part of the 20% performance increase.
▶ No.809606>>809622 >>809627
>>809600
How can there be more than one, Pajeet?
▶ No.809611>>809614
>>809602
cursor.execute('DROP TABLE challenge')
conn.commit()
conn.close()
Those lines are unreachable, aren't they?
▶ No.809614
>>809611
ya that's another fuckup, that should go above
if fetch!=None:
i don't know how necessary it even is, python probably closes it and garbage collects it anyway, in theory.
▶ No.809615>>809636 >>809648
i made the fastest version:
extern crate memchr;
extern crate memmap;
extern crate rayon;

use std::cmp::{min};
use memchr::{memchr};
use memmap::{Mmap};
use rayon::prelude::*;

const MAX_TEST_CASES: usize = 40;
const MAX_N: usize = 100;
const MIN_L: usize = 10000;
const MAX_L: usize = 30000;
const MIN_LINE_LEN: usize = MIN_L + MIN_L - 1;
const MAX_LINE_LEN: usize = MAX_L * 3 + MAX_L - 1;
const MAX_TESTS_LEN: usize = MAX_LINE_LEN * MAX_TEST_CASES + MAX_TEST_CASES;

fn main() {
    let mut args = std::env::args_os().skip(1);
    assert_eq!(args.len(), 1);
    let mmap = Mmap::open_path(args.next().unwrap(), memmap::Protection::Read).unwrap();
    let slice = unsafe { mmap.as_slice() };
    for &r in do_it(slice).iter().filter(|&r| r.is_some()) {
        match r.unwrap() {
            Some(v) => { println!("{}", v); }
            None => { println!("None"); }
        }
    }
}

fn do_it(slice: &[u8]) -> [Option<Option<usize>>; MAX_TEST_CASES] {
    debug_assert!(slice.len() <= MAX_TESTS_LEN);
    let mut results = [None; MAX_TEST_CASES];
    split_line(slice, &mut results);
    results
}

fn split_line(slice: &[u8], results: &mut [Option<Option<usize>>]) {
    if slice.is_empty() {
        return;
    }
    debug_assert!(slice.len() >= MIN_LINE_LEN);
    let (r, rs) = results.split_first_mut().unwrap();
    match memchr(b'\n', &slice[MIN_LINE_LEN..min(slice.len(), MAX_LINE_LEN + 1)]) {
        Some(i) => {
            let (sl, sr) = slice.split_at(MIN_LINE_LEN + i);
            rayon::join(|| process_line(sl, r), || split_line(&sr[1..], rs));
        }
        None => { rayon::join(|| process_line(slice, r), || {}); }
    }
}

fn process_line(slice: &[u8], result: &mut Option<Option<usize>>) {
    debug_assert!(slice.len() >= MIN_LINE_LEN);
    debug_assert!(slice.len() <= MAX_LINE_LEN);
    let (c, m) = slice.par_split(|&b| b == b',')
        .map(|b| parse(b))
        .fold(|| (0, [0; MAX_N + 1]), |(c, mut m), i| { m[i] += 1; (c + 1, m) })
        .reduce(|| (0, [0; MAX_N + 1]), |(c1, mut m1), (c2, m2)| { for i in 0..MAX_N + 1 { m1[i] += m2[i]; } (c1 + c2, m1) });
    match m.par_iter().cloned().enumerate().find_any(|&(_, v)| v > c / 2) {
        Some((i, c)) => {
            debug_assert!(c >= MIN_L);
            debug_assert!(c <= MAX_L);
            *result = Some(Some(i));
        }
        None => { *result = Some(None); }
    }
}

fn parse(b: &[u8]) -> usize {
    debug_assert!(b.iter().all(|&b| b >= b'0' && b <= b'9'));
    debug_assert!(std::str::from_utf8(b).unwrap().parse::<usize>().unwrap() <= MAX_L);
    (match b.len() {
        1 => { b[0] - b'0' }
        2 => { (b[0] - b'0') * 10 + (b[1] - b'0') }
        3 => { 100 }
        _ => { unreachable!() }
    }) as usize
}
▶ No.809622>>809627 >>809628
>>809606
sheeeeeeiiiiiit im retarded
▶ No.809627
>>809606
>>809600
>>809622
that was in the back of my mind too until i thought about it for a second
▶ No.809628>>810021
>>809622
We all have our moments of diversity.
▶ No.809636>>809638 >>809640 >>809643
>>809615
How do you run this shit? I tried running it and got:
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted
Also, how does one even know what source is being pulled into these rust projects?
Downloading rayon-core v1.2.1
Downloading futures v0.1.16
Downloading coco v0.1.1
Downloading rand v0.3.17
Downloading lazy_static v0.2.9
Downloading num_cpus v1.7.0
Downloading scopeguard v0.3.3
Downloading either v1.3.0
Downloading memmap v0.5.2
Downloading libc v0.2.32
Downloading memchr v2.0.0
Has anyone ever looked at what these are? How many are from complete nobodies with no oversight and no maintenance? How do I get the exact source that the version installed corresponds to to see if they're evil?
▶ No.809638>>809639
>>809636
holy fuck you stupid nigget increase you stack size limit
▶ No.809639
>>809638
I have no idea how to make your shitty language work. If you need stupid amounts of ram or whatever, say what options to build with.
▶ No.809640
>>809636
look into the target folder. cargo caches the crates there
▶ No.809643>>809644
>>809636
>rust happily pulls in random sources from random places
great language steve
▶ No.809644>>809650
>>809643
>random sources
the Cargo.toml tells you exactly what gets pulled
>random places
it pulls the requested crates from crates.io
very low energy bait.
▶ No.809648>>809652 >>809704 >>809770
>>809615
>i made the fastest version
dohoho, we'll see about that.
I built your shit on an i7-6700k in --release mode and ran the binary directly rather than via cargo using test data from >>809582 and compared it to mine in >>808872 . If there's some other bullshit I need to do to make rust go fast, tell me. Otherwise,
Yours:
real 0m0.020s
user 0m0.108s
sys 0m0.016s
Mine:
real 0m0.009s
user 0m0.008s
sys 0m0.000s
You got beat by someone who identifies as a girl for purposes of determining score! Girl power!
And not just beat, fucking destroyed. Let's look closer at that. You used 8 threads (6700k) against a single-threaded program and still lost. Not only did you lose, you managed to waste 0.108s of CPU time for a 0.020s runtime. So in reality rather than an artificial test where your program would be competing for CPU time with other programs, it's more accurate to say you need 108ms of CPU time to produce a result vs 8ms of CPU time. You also used 3x the ram. gj Rust community!
▶ No.809650>>809653 >>809671
>>809644
Have you looked at any of it? I looked at what was being pulled in via rayon just now and they're unsafely trusting the environment. I can make any Rust program using that crate crash if I can get "RAYON_NUM_THREADS" into the environment. Did the Rustfags learn nothing from shellshock? Fucking christ, this is like a '90s tier exploit. Webdevs have no business writing systems code.
▶ No.809652>>809654
>>809648
fugg XDDDD
my rust version is really way slower
▶ No.809653
>>809650
Also, the funny thing is rayon disables checking for RAYON_LOG in release builds as they recognize trusting the environment is unsafe (I was hoping I could overwrite your /etc/passwd), but they neglected to do that for their other environment variables. Check log.rs and lib.rs. I wonder how much of the Rust ecosystem is this insecure, or outright malicious, since no one checks.
▶ No.809654
>>809652
shit i really fucked up bigly. for some reason this shit doesnt even utilize all my cores.
well at least is is memory and thread safe XDDDDDDD
▶ No.809666>>809670
>>809592
with the modified function from >>809602
▶ No.809670>>809672
>>809666
where are these tests coming from?
▶ No.809671>>809680
>>809650
>trusting the environment is bad
What the fuck am I reading? If you can't trust the environment you have bigger problems than someone setting RAYON_NUM_THREADS to over 9000.
▶ No.809672>>809677
>>809670
from some pajeet code challenge/practice website thing
▶ No.809677
>>809672
>can't solve the challenge
>has an autism fit when other people solve it
>calls everything pajeet
stay mad kiddo
▶ No.809680>>809686
>>809671
It's like you've already forgotten the lessons re-learned from shellshock. The environment should not be trusted. There are too many ways to get information into it, and it's led to hundreds of exploits over the years.
▶ No.809686>>809688 >>809710
>>809680
>shellshock
>Many Internet-facing services, such as some web server deployments, use Bash to process certain requests
LOL. bash did nothing wrong. webdevs did
>The environment should not be trusted
>int fd = open(argv[1], O_RDONLY);
LOL. webdev confirmed
>There are too many ways to get information into it, and it's led to hundreds of exploits over the years.
the same can be said about sql queries.
▶ No.809687
>>809582
It's like you don't know Python.
#!/usr/bin/python3
ii=__import__;i=ii("random");r,ri=i.random,i.randint;m=lambda lll:str(int(lll()));[getattr(__builtins__,getattr("","".join([chr(i) for i in sum([[j,j+5] for j in [[k*5+1,k*5] for k in [21]][0]],[])]))([chr(i) for i in [112,114,105,110,116]]))(",".join([getattr(m,"".join([j if chr(99)==j[0] else i for i,j in [[(['__']*3)[i],['wall', 'call', 'mall'][i]] for i in range(3)]]))(lambda:ri(0,100)) for j in range(ri(10000, 30000))])) for i in range(40)]
Sadly can't add bias here, though.
▶ No.809688>>809689
>>809686
>numerous environment based CVEs dating back 20 years
>interpreters like perl, python, php, and ruby provide ways to disable dangerous environment variables
>numerous attacks on suid binaries through environment variables they didn't know they had
>rayon itself disables almost all of its dangerous environment variables in release mode, but misses two
"Nope, must not be a security problem!"
You're a fucking moron, anon. Please never work on anything important.
▶ No.809689>>809691 >>809710
>>809688
lol stop sperging out, webdev. it is not a programs fault if its environment is compromised. maybe webdevs should stop putting malicious stuff into the environment????
>rayon itself disables almost all of its dangerous environment variables in release mode, but misses two
literally the only two environment variables rayon reads are RAYON_NUM_THREADS and RAYON_LOG: https://github.com/rayon-rs/rayon/search?utf8=%E2%9C%93&q=env%3A%3Avar&type=
rayon enables logging only in debug mode or if you enable a compiler flag. it isnt disabled because "muh dangerous env variables".
fucking kill yourself you fag.
▶ No.809691>>809693
>>809689
Hello, eternal September. Enjoy re-learning what we learned the hard way. lol.
▶ No.809693>>809694
>>809691
not an argument, lol. reading environment variables is totally ok and if there is shit in there that isnt supposed to be in there, it isnt my fault, but the webdevs who put it there.
if you dont want a particular program using environment variables because you are a webdev and might have put shit into it, just clear them.
▶ No.809694>>809697
>>809693
>if there is shit in there that isnt supposed to be in there, it isnt my fault, but the webdevs who put it there
Quality defense-in-depth on display from the eternal September.
▶ No.809697>>809700
>>809694
>defense-in-depth
holy shit you fucking retard. you won. i wont read env variables anymore. i also wont use command line arguments, files, user input and all that good stuff. after all, if my programm does absolutely nothing than it cant do anything wrong, right?
in fact you advice made even rust obsolete, since accessing memory is insecure.
▶ No.809700>>809701
>>809697
You type like a child.
▶ No.809701
>>809700
amazing argument. you win 10 interwebz
▶ No.809703>>809706 >>809709 >>809731 >>810187 >>811220
#include <iostream>
#include <fstream>
#include <cstring>
#include <cstdint>

#define N 101

int main(int argc, char** argv) {
    uint_fast16_t count[N];
    std::memset(count, 0, sizeof count);
    if (std::ifstream infile{argv[1], std::ios::binary}) {
        uint_fast16_t n = 0;
        uint_fast16_t L = 0;
        while (infile >> n) {
            ++count[n];
            ++L;
            if (infile.peek() == ',') {
                infile.ignore(1);
            }
            else {
                L /= 2;
                bool found = false;
                for (size_t i = 0; i < N; ++i) {
                    if (count[i] > L) {
                        std::cout << i << "\n";
                        found = true;
                        break;
                    }
                }
                if (!found) {
                    std::cout << "None\n";
                }
                std::memset(count, 0, sizeof count);
                L = 0;
            }
        }
    }
}
▶ No.809704>>809707
>>809648
What about memory usage in either of those programs?
▶ No.809705
>>809307
euro cs student here,
people graduating from cs in europoora can't even write a for loop
▶ No.809706
>>809703
i didn't think about doing it that way at all, i'm going to write something similar in python and see how shitty the python performance is doing the same type of thing.
▶ No.809707>>809810
>>809704
I mentioned the core size difference was 3x. The C version is about 3M (the size of the worst-case test file that it reserves space for). The rust version is about 9M but I have no idea how to reason about where the memory is getting used as I don't know rust. It might just be from all the stack segments for the threads.
▶ No.809710>>809716 >>809717
>>809689
>webdevs should stop putting malicious stuff into the environment
they don't. CGI does. Any webserver that implements it correctly will do likewise. It's webdevs who are responsible for being three degrees removed from the people responsible for responsibly handling their environment variables, just like any other program input.
>>809686
>bash did nothing wrong
kill yourself. this was a completely retarded 'feature' that would've been a massive problem even if the web had not existed at all. How common is it for programs to cheap out and invoke a subshell when not strictly necessary? Especially when the command is safe on its face? Bash made that unsafe.
EATSHIT="I don't even remember how this useless 'feature' goes" /usr/bin/some_setuid_admin_bin
^ now any subshell also does an rm -rf as root. Nice fucking job.
▶ No.809716
>>809710
>CGI does
ok. but does cgi write to RAYON_NUM_THREADS or RAYON_LOG?
▶ No.809717
▶ No.809719>>809720
can you two autists go get your own thread where you can hatefuck each other all you want?
▶ No.809720
>>809719
right after you post some code
▶ No.809731>>809735 >>811220
>>809703
Slightly changed to use more idiomatic C++
#include <iostream>
#include <fstream>
#include <array>
#include <algorithm>
#include <iterator>
#include <cstdint>

#define N 101

int main(int argc, char** argv) {
    std::array<uint_fast16_t, N> count;
    std::fill(count.begin(), count.end(), 0);
    if (std::ifstream infile{argv[1], std::ios::binary}) {
        uint_fast16_t n = 0;
        uint_fast16_t L = 0;
        while (infile >> n) {
            ++count[n];
            ++L;
            if (infile.peek() == ',') {
                infile.ignore(1);
            }
            else {
                L /= 2;
                auto loc = std::find_if(count.begin(), count.end(), [&](uint_fast16_t& i) {
                    return i > L;
                });
                if (loc != std::end(count)) {
                    std::cout << std::distance(count.begin(), loc) << "\n";
                }
                else {
                    std::cout << "None\n";
                }
                std::fill(count.begin(), count.end(), 0);
                L = 0;
            }
        }
    }
}
▶ No.809735>>809738
>>809731
i really need to stop wasting time with python.
▶ No.809738>>809744
>>809735
what do you mean?
▶ No.809744>>809746 >>809751
>>809738
i've spent way too long writing everything in python and it would take time now to get back into C. python is great for glue and all the libraries and all that but clearly the performance no matter what you do is going to be 100 times worse than C.
▶ No.809746
>>809744
mfw even perl5 is faster than python
▶ No.809747>>809759 >>809773
Since the rustfag is trying to beat my single-threaded code with threads, I'm going to swoop in here and 'improve' my entry. It is unbeatable without just micro-optimizing this code and resubmitting it. Bow down to your queen, and learn something from my superior technique. Time this with that site of yours, and it better have more than one goddamn core available after I spent time doing this.
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/mman.h>
#include <sched.h>
#include <errno.h>
#include <stdio.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define MAX_WORKERS 64
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t const tens[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
static long processors;
static off_t size;

static inline uint_fast8_t parse_n(char * restrict * restrict c) {
    uint_fast8_t retval = 0;
    char *begin = *c;
    while(**c >= '0' && **c <= '9') {
        ++*c;
    }
    int len = *c - begin;
    switch(len) {
    case 3: retval = 100; break;
    case 2: retval = tens[begin[len - 2] - '0'];
    case 1: retval += begin[len - 1] - '0';
    }
    return retval;
}

typedef struct {
    pthread_t thread;
    char *begin;
    char answer[MAX_CASES][5];
    int answers;
    uint_fast16_t counts[100 + 1];
} __attribute__((aligned(64))) worker_t;

worker_t workers[MAX_WORKERS];

void *thread_start(void *data) {
    long processor = (long) data;
    worker_t *worker = &workers[processor];
    uint_fast8_t n;
    uint_fast16_t l;
    int i;
    char *c = buffer + (size * processor) / processors;
    if(c > buffer) {
        c = strchr(c - 1, '\n') + 1;
    }
    if(!*c) return NULL;
    worker->begin = c;
    char *end;
    if(processor < processors) end = buffer + (size * (processor + 1)) / processors;
    else end = buffer + size;
    goto start;
    for(; c < end; ++c) {
        (void) memset(worker->counts, 0, sizeof(worker->counts));
start:
        for(l = 1; ; ++l, ++c) {
            n = parse_n(&c);
            ++worker->counts[n];
            if(*c == '\n') break;
        }
        uint_fast16_t half = l >> 1;
        for(i = 0; i < sizeof(worker->counts) / sizeof(worker->counts[0]); ++i) {
            if(worker->counts[i] > half) {
                sprintf(worker->answer[worker->answers], "%d", i);
                break;
            }
        }
        if(i == sizeof(worker->counts) / sizeof(worker->counts[0])) sprintf(worker->answer[worker->answers], "None");
        ++worker->answers;
    }
    return NULL;
}

int main(int argc, char *argv[]) {
    struct stat stat;
    int fd = open(argv[1], O_RDONLY);
    fstat(fd, &stat);
    size = stat.st_size;
    buffer = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    errno = 0;
    processors = sysconf(_SC_NPROCESSORS_ONLN);
    if(processors > MAX_WORKERS) processors = MAX_WORKERS;
    for(long processor = 0; processor < processors; ++processor) {
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        cpu_set_t cpus;
        CPU_ZERO(&cpus);
        CPU_SET(processor, &cpus);
        pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &cpus);
        pthread_create(&workers[processor].thread, &attr, &thread_start, (void *) processor);
    }
    for(long processor = 0; processor < processors; ++processor) {
        worker_t const *worker = &workers[processor];
        pthread_join(worker->thread, NULL);
    }
    for(long processor = 0; processor < processors; ++processor) {
        worker_t const *worker = &workers[processor];
        if(processor + 1 < processors && workers[processor].begin == workers[processor + 1].begin) continue;
        for(int i = 0; i < worker->answers; ++i) {
            printf("%s\n", worker->answer[i]);
        }
    }
    return 0;
}
Thing is, if you look close, this is maximum faggotry. I would never write code like this normally. It is highly optimized to have an extremely good wall clock time (how these tests are scored) but will sacrifice CPU time to do it. In the real world where you have other shit running, that's pretty dumb. But winning is everything, so whatever. Some highlights:
- It gambles that the tests are being run in an unclean state and uses mmap() to swipe the data from the fs cache. mmap() is slower in the real world due to the cost of page faults.
- It divides the buffer across threads and has each thread find its starting position independent of the others rather than what you're taught in school: to make a list of the work (e.g., split the lines) then divide it amongst work queues. This is a big optimization for wall-clock time but it means there is overlap and some results might be computed and thrown away (wasted CPU time). Threads will have a less balanced workload with this technique, but it's a huge win for wall clock time not having to scan the entire file on a single thread, so it works out.
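If the pointer arithmetic makes that hard to see, here's a rough Python mirror of the worker start-position search (illustrative only, not part of the entry):
def chunk_starts(data, workers):
    # worker k starts at byte len*k/workers, then snaps forward to the
    # next line boundary; neighbours that snap to the same line start
    # get deduplicated later by comparing start offsets
    starts = []
    for k in range(workers):
        pos = len(data) * k // workers
        if pos > 0:
            nl = data.find(b"\n", pos - 1)
            pos = nl + 1 if nl != -1 else len(data)
        starts.append(pos)
    return starts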
▶ No.809751
>>809744
FWIW, I (your queen) heavily use Python, C, and C++. They're all great at their niche. Just don't use the wrong tool for the job.
▶ No.809759>>809761
>>809747
/tmp/ccMmS4zt.o: In function `main':
source.c:(.text+0x364): undefined reference to `pthread_attr_setaffinity_np'
source.c:(.text+0x37c): undefined reference to `pthread_create'
source.c:(.text+0x3b0): undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
Sorry queen
▶ No.809761>>809762
>>809759
-lpthread, newfag.
▶ No.809762>>809766 >>809767
>>809761
Can't influence switches on the testing site, sorry queer queen
▶ No.809766>>809768 >>809770
>>809762
That's pretty terrible. So it allows threads in C++ and Rust but locks out C? What a Pajeet website.
▶ No.809767>>809771
>>809762
>I would immediately and comfortably use a library for this in any real-world circumstance
>my professor's test environment doesn't include it
quit college. try making demands for my money back no matter how unlikely that is. drop hints about coming back to get a degree in philosophical proctology
>my code wins if you pass -Ogo-faster
>but this testing site only caters to 'default' language features
sell all property
go to renfair
buy bastard sword
travel to physical locations of testing site's admins
"shall we cross steel, churl?"
▶ No.809768
>>809766
I haven't checked if it supports C++ threads.
It doesn't support Rust at all lel
▶ No.809770>>809772
>>809766
Since it locks it out, here's results from a 6700k using that test data from python,
real 0m0.001s
user 0m0.004s
sys 0m0.000s
1 MS
Compare with the single-threaded code I gave numbers for in >>809648,
real 0m0.009s
user 0m0.008s
sys 0m0.000s
So I'd expect if that shitty site could run it, since it was at 3ms in >>809255, it might register as 0ms.
▶ No.809772
>>809770
The site seems to value low memory usage more than cpu usage after a certain threshold for cpu time has been crossed.
▶ No.809773>>809774
>>809747
i changed your single threaded version to also memory map the file and it is as fast as your multithreaded version
▶ No.809774>>809779
>>809773
It needs a large test case to notice a difference. The script in >>809582 should be enough. It scales really well so the larger the data gets the better it'll do.
▶ No.809779>>809780 >>809783
>>809774
lol no i tripled the number of tests and now the single threaded version is faster
▶ No.809780>>809784
>>809779
You might have a shitty CPU, anon. You can try limiting the MAX_WORKERS to half to see if you've got a bad implementation of hyperthreads.
▶ No.809783
>>809779
Also, if you've got something burning CPU for no reason, I set affinity on each thread and the slowest one determines your runtime so that will create a bottleneck. You can comment out the affinity stuff and have the scheduler route around that, but it's less efficient.
▶ No.809784>>809791
>>809780
i tried 32, 16, 8 and 4 workers. single threaded version is still better.
▶ No.809790>>809793 >>809794
Queen,
what is your dayjob? Are you from NA or EU?
▶ No.809791>>809792
>>809784
It points to a problem with your setup. The bottleneck code is essentially the same other than setting affinity to run on a core the scheduler might not have otherwise chosen. You should look into it further.
▶ No.809792>>809800
>>809791
no my setup is fine. it is just that your multi threaded version is shit. modify the single threaded to use mmap and see for yourself
▶ No.809793>>809795
>>809790
I do low-level networking on embedded devices and gamedev. Simultaneously. It's been a strange ride. I'm currently working 12 hours a day.
▶ No.809794>>809797
>>809790
Also, in socal. And I've not slept yet. It's 8:30am..
▶ No.809795>>809796 >>809801
>>809793
>working 12 hours a day
>still shitposts on /tech/
i smell LARPing
▶ No.809796>>809799
>>809795
i doubt a LARPer could produce the C source in this thread
▶ No.809797>>809804
>>809794
>socal
are you also actually a girl, because that'd be hilarious
▶ No.809799
>>809796
i mean it really isnt that hard. if you cant do that you should consider enrolling into a cs course
▶ No.809800>>809802
>>809792
Modified to use mmap:
real 0m0.008s
user 0m0.008s
sys 0m0.000s
Again, your setup is shit. Like I said, there's no significant change in the bottleneck. It's purely going to be an issue with affinity that you can solve.
▶ No.809801>>809803
>>809795
I had today off to deal with an immigration interview for my wife (she's not mail order, just Canadian).
▶ No.809802>>809805
>>809800
no you fag. my setup is fine. your shitty multithreaded version just sucks.
▶ No.809803
>>809801
>canadians cant be mail order
▶ No.809804>>809806
>>809797
No, I'm a 40 year old WHITE MALE.
▶ No.809805
▶ No.809806
>>809804
Hi i'm tyrone, want me to take care of your wife?
▶ No.809810>>809821
>>809707
B-b-but muh SAFE memory usage!!!! It's safe.
▶ No.809821
>>809810
yes it is safe. when i wrote this piece of shit rust code i intended it to go fast and didnt give a shit about memory usage. unfortunately i didnt benchmark this trashy codevomit before i proudly presented it to you.
▶ No.809823
>>809815
>50%+1 times, only use half of the sequence.
No, you cannot assume that.
▶ No.809825>>809826 >>809827 >>809829 >>809835 >>809856 >>810015
i deleted the last post because i accidentally posted the wrong fucked up code. this is the correct code, and time is down to 1.0-1.2s compared to 2.2s with
>>809592
still obviously shit compared to these C solutions, it's 1.8s with 1 process.
the improvement here is get rid of half of the sequence immediately, since the number is guaranteed to be in 50%+1 of that sequence.
#!/bin/python2.7
import argparse
import sqlite3
from multiprocessing import Pool

def dowork(sequence):
    conn = sqlite3.connect(':memory:')
    conn.execute('PRAGMA journal_mode = OFF')
    conn.execute('PRAGMA synchronous = OFF')
    conn.execute('PRAGMA locking_mode = EXCLUSIVE')
    cursor = conn.cursor()
    seq_list = sequence.split(',')
    target_length = (len(seq_list) / 2)
    # drop half of the sequence, unnecessary
    seq_list = seq_list[:target_length]
    seq_list = [(x,) for x in seq_list]
    cursor.execute('CREATE TABLE challenge (integers integer)')
    cursor.executemany('INSERT INTO challenge VALUES (?)', seq_list)
    conn.commit()
    cursor.execute('''SELECT integers,COUNT(1) as integers FROM challenge GROUP BY integers
        ORDER BY COUNT(1) DESC LIMIT 1''')
    fetch = cursor.fetchone()
    cursor.execute('DROP TABLE challenge')
    conn.commit()
    conn.close()
    if fetch == None:
        return None
    elif fetch[1] > ((target_length / 2) - 1):
        return fetch[0]
    else:
        return None

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('filename', type=str)
    args = parser.parse_args()
    try:
        sequences = open(args.filename, 'r').read().rstrip().split('\n')
    except:
        print 'None'
    p = Pool(1)
    majors = p.map(dowork, sequences)
    for major in majors:
        print major
    dowork(sequences[0])

if __name__ == "__main__":
    main()
▶ No.809826
>>809825
>since the number is guaranteed to be in 50%+1 of that sequence.
Stop saying that. How many ways are there to select HALF of the numbers?
▶ No.809827
>>809825
i need a drink, still fucked up the post, delete
dowork(sequences[0]) in main(). that was just there for testing.
▶ No.809829>>809830 >>809856
>>809825
> seq_list=sequence.split(',')
> target_length=(len(seq_list)/2)
> #drop half of the sequence, unnecessary
> seq_list=seq_list[:target_length]
▶ No.809830>>809831 >>809832
>>809829
that is a very happy looking nigger. what is he looking at?
▶ No.809831
>>809830
You're in for a treat. Just watch the first minute.
https://www.youtube.com/watch?v=WaM47Otif04
▶ No.809832
▶ No.809835
>>809825
sergio, el scripto no werko
▶ No.809836>>809838 >>809885
Noob here, why is the C++ solution more concise than the C one but still rather efficient? Isn't C++ shit?
▶ No.809838>>809841
>>809836
>why do I need less code when I use higher-level code
because you're getting more done per line, duh. the parse_n in the cheater threads code would not even take up a single line in a scripting language; it'd be one subexpression of a line, like string.gmatch in the lua:
print(major(string.gmatch(line, "%d+")) or "None")
language shitness isn't just about code terseness, or we would also shill APL
▶ No.809841>>809842 >>809845
>>809838
But people here say C is the best and shit like C++ and Java are bloated heaps of trash.
▶ No.809842>>809857
>>809841
i'm waiting for someone to post the x86 assembly solution.
▶ No.809845>>809858
>>809841
again: is APL the best? There's more to language shitness than terseness. Terseness doesn't even reliably translate to humans spending less time on writing/reading/troubleshooting that span of code.
▶ No.809856>>809886 >>810015
>>809825
>>809829
>the improvement here is get rid of half of the sequence immediately
The maximum will occur in some half+1 length subsequence, but you made the false assumption that the subsequence necessarily starts at the beginning. Imagine the subsequence as a sliding window, and you'll see why that assumption is wrong: in 1,2,3,9,9,9,9 the element 9 is major (4 of 7) yet never appears in the first half.
See gif related.
▶ No.809857
>>809842
>not x86_64
No wonder you aren't God's chosen high priest.
▶ No.809858>>809860
>>809845
/tech/ said C++ is shit so i refuse to believe it can be as good as gods language also known as C.
▶ No.809860>>809864
>>809858
ok mang. You saw through my attempts at deflection. Here's the truth: C is God's language and C++ is complete shit. C++'s shitness does not extend to it not having a number of conveniences that show off in toy code like this. The devil's succubi do not appear as ugly hags, but beautiful maidens. Similarly C++ has superficial benefits that lure you into letting it waste tremendous amounts of brainspace on it. C++ is the devil's own attempt at a DOS attack on your brain. Though warlocks and sinners believe that they can deal with the devil, extract some narrow advantage ("I'll just use Qt for the GUI"), and then flee, they invariably fall to hell as well, and become mad and incapable of safe coding. Although some will claim to have successful careers, this is a combination of survivor's bias and the AIDS-patient mental weakness of such people not coming across in text.
Save yourself.
▶ No.809861>>809866 >>811099 >>811220
>>808650 (OP)
package main

import (
    "bufio"
    "fmt"
    "os"
    "strconv"
    "strings"
)

func process(filename string) {
    f, err := os.Open(filename)
    if err != nil {
        panic(err)
    }
    defer f.Close()
    s := bufio.NewScanner(f)
    for s.Scan() {
        L := 0
        hist := make([]int, 101)
        numStrs := strings.Split(s.Text(), ",")
        for _, numStr := range numStrs {
            num, err := strconv.Atoi(numStr)
            if err != nil {
                panic(err)
            }
            if num < 0 || num > 100 {
                panic("number out of range")
            }
            L++
            hist[num]++
        }
        func() {
            for num, count := range hist {
                if count > L/2 {
                    fmt.Println(num)
                    return
                }
            }
            fmt.Println("None")
        }()
    }
}

func main() {
    if len(os.Args) != 2 {
        fmt.Printf("usage: %s [input]\n", os.Args[0])
        return
    }
    process(os.Args[1])
}
Go
am grill
▶ No.809864
>>809860
>devil's own attempt at a DOS attack on your brain
kek. Reminds me of my dance with the devil called Haskell. After reading a paper on how the type system can solve a rubiks cube, I really questioned what I was doing with my life.
▶ No.809869
>>809866
^_^
could have brought memory usage down a bit by reusing hist instead of making a new one, but since Go binaries are usually several megabytes by themselves, there's no point.
▶ No.809872>>809877 >>809878 >>809888
Current power ranking:
1. C/C++ (the site actually gave the C++ one a higher rating even though it's somewhat slower)
2. Python
3. Go
?. Rust (no metrics exist for the rust program so not sure where it ranks)
▶ No.809877
>>809872
>no metrics for the rust program
because the site doesn't support it? what languages does it support then?
▶ No.809878>>809919 >>810458
>>809872
You have to use these languages I believe:
> Python 2.7.6
> C 4.8.4
> C++ 4.8.4
> Java 1.8.0
> Ruby 2.3.0
> Perl 5.18.2
> PHP 5.5.9
> Tcl 8.6
> Clojure 1.7.0
> JavaScript 5.3.0
> Scala 2.9.2
> C# 4.0
> Objective-C 4.8.4
> Python 3 3.4.3
> Haskell 7.6.3
> Go 1.5.2
> Bash 4.3.11
> Lua 5.2.3
> R 3.2.3
> Visual Basic.NET 4.0
> D 4.8.4
> Fortran 4.8.4
> Guile 2.0.9
> OCaml 4.01.0
> Scheme
No Erlang, Ada, Rust, APL, J... total pleb tier.
▶ No.809885
>>809836
The goal was to be fast, not concise. The C solutions could be made concise.
▶ No.809886>>810015
>>809856
Ahh, just occurred to me that even this is wrong too. It's relying on the assumption of a contiguous subsequence... and why should it be contiguous? The number of ways of selecting half+1 from even just 100 is astronomical. Trying to skip inputs is a pure dead end.
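To put a number on "astronomical" (stdlib only, needs Python 3.8+ for math.comb; the 100 is just the small case from the post, real L is 10000+):
import math

# ways to pick the 51 positions a major element could occupy out of just 100
print(math.comb(100, 51))  # ~9.9e28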
▶ No.809888>>809891 >>809921
>>809872
>the site actually gave the C++ one a higher rating even though it's somewhat slower
Jesus, this site... What metric do you think it's weighing higher?
▶ No.809891
>>809888
their signups are also broken. I never got the confirmation email. searched and it's a known problem.
▶ No.809918
#!/bin/bash
declare -a numbers
while read line
do
    IFS=', ' read -r -a array <<< "$line"
    for element in "${array[@]}"
    do
        numbers[$element]=$(( numbers[$element] + 1 ))
    done
    max=0
    for i in "${!numbers[@]}"
    do
        (( numbers[$i] > numbers[$max] )) && max=$i
    done
    # compare against the line's length, not the number of distinct values seen
    if (( numbers[$max] > ${#array[@]} / 2 ))
    then
        echo "$max"
    else
        echo "None"
    fi
    for i in "${!numbers[@]}"
    do
        numbers[$i]=0
    done
done < "$1"
▶ No.809919
>>809878
>gcc 4.8.4
>python3 3.4.3
>GHC 7.6.3
Is that fucking debian oldoldstable?
▶ No.809921>>809922 >>810067
>>809888
CPU time does matter but after a certain limit it doesn't net you any more points. Low memory gives a lot of points.
▶ No.809922
>>809921
I wonder if it'd count the shared mmap version as the full file size even though it only actually needs 4k of ram. I'll send you that version when I get home.
▶ No.809928>>809930 >>809973 >>810181
Step aside, Pajeet!
import Control.Arrow
import Data.List
import Data.List.Split
import Data.Maybe
import System.Environment
main = getArgs >>= readFile . head >>= putStr . unlines . map majorElement . lines
majorElement = maybe "None" snd . listToMaybe . uncurry moreThanHalf . (length &&& count) . splitOn ","
count = map (length &&& head) . group . sort
moreThanHalf l = filter ((> l `div` 2) . fst)
▶ No.809973>>809998
>>809928
>>809928
>bing uncurry
>webpage is just (a -> b -> c) -> ((a, b) -> c)
I don't have enough autism for haskell.
▶ No.809998>>810000
>>809973
Imagine you have a machine that, when you feed it a coin of type A, outputs another machine that takes coins of type B. When you give that machine a B, it gives you a C.
What uncurry does is wrap those details up and say: if you give me an A and a B right now, I'll give you a C.
▶ No.810000
>>809998
Actually I missed one step at the end, it's a little more subtle. Uncurry is a machine that takes the machine I first described, and gives you a machine that takes an A and B at the same time. Of course, once you subsequently give that machine an A and B, you get a C.
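If the Haskell is what's filtering you, here's the same machine metaphor in Python (all names here are mine, not from any library):
def machine(a):          # feed it an A...
    def inner(b):        # ...it hands back a machine that takes a B...
        return a + b     # ...which finally gives you a C
    return inner

def uncurry(f):
    # takes the machine-returning machine, gives back one that wants both coins at once
    return lambda pair: f(pair[0])(pair[1])

print(machine(2)(3))             # 5, one coin at a time
print(uncurry(machine)((2, 3)))  # 5, both coins at once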
▶ No.810015>>810021
>>809886
>>809856
>>809825
fuck, you're right, this solution isn't going to work at all. it only works on the test cases because the major numbers happen to make up > 50% of the first half of the sequence, but they don't have to. i thought for a second you could drop every other number instead to fix it, but that has the exact same problem.
▶ No.810021
>>810015
No worries, as >>809628 put it, we all have these moments. The difference is being autistic enough to care in the first place about rigorously attacking the problem. A true Pajeet can shit in his code base like this >>809051 and be proud of it.
▶ No.810038>>810043 >>810087 >>810168
ye olde sepples
#include <iostream>
#include <fstream>
#include <string>
#include <cstdint>

template<size_t N>
inline uint16_t count_each(std::string line, uint16_t(& count_of)[N])
{
    uint8_t curval = 0;
    uint16_t n_numbers = 1;
    for (size_t i = 0, sz = line.size(); i < sz; ++i){
        char c = line[i];
        if (c == ','){
            ++count_of[curval];
            curval = 0;
            ++n_numbers;
            continue;
        }
        curval = curval * 10 + (c - '0');
    }
    ++count_of[curval];
    return n_numbers;
}

template<size_t N>
inline size_t argL2(uint16_t(& count_of)[N], uint16_t l2)
{
    for (size_t i = 0; i < N; ++i)
        if (count_of[i] > l2)
            return i;
    return N;
}

inline void major(const char* filename)
{
    std::ifstream file(filename);
    for (std::string line;;){
        getline(file, line);
        if (!file) return;
        uint16_t count_of[101] = {};
        const uint16_t sz = count_each(std::move(line), count_of);
        const size_t res = argL2(count_of, sz / 2);
        if (res == sizeof(count_of) / sizeof(count_of[0]))
            std::cout << "None\n";
        else
            std::cout << res << '\n';
    }
}

int main(int argc, char* argv[])
{
    major(argv[1]);
}
▶ No.810043>>810054
>>810038
a simple for loop that reads integers into an array while the ifstream object != ',' would be more succinct.
▶ No.810054
>>810043
True, but I was more trying to limit the number of ifstream calls for efficiency.
Don't know how much more efficient it actually is, though :^)
▶ No.810067>>810182 >>811220
>>809921
This one should show up as low memory as the buffer is only necessarily 4k, but the site might be shit and count the whole map as memory. If it's fucked like that, I can change it in another way to get it to show low memory. This is just my single-threaded one using mmap() to see if it changes the ranking.
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/mman.h>
#include <sched.h>
#include <errno.h>
#include <stdio.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define MAX_L 30000
#define SEPARATOR_LENGTH 1
#define EOL_LENGTH 1
#define MAX_N_LENGTH 3
#define MAX_CASES 40
#define MAX_LINE_LENGTH ((SEPARATOR_LENGTH * (MAX_L - 1)) + (MAX_N_LENGTH * MAX_L) + EOL_LENGTH)
#define MAX_FILE_LENGTH MAX_LINE_LENGTH * MAX_CASES

char *buffer;
uint_fast64_t counts[101]; /* N can be 100, so 101 buckets */
uint_fast64_t const tens[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};

inline uint_fast64_t parse_n(char * restrict * restrict c) {
    uint_fast64_t retval = 0;
    char *begin = *c;
    while(**c >= '0' && **c <= '9') {
        ++*c;
    }
    int len = *c - begin;
    switch(len) {
    case 3: retval = 100; break;
    case 2: retval = tens[begin[len - 2] - '0']; /* fall through */
    case 1: retval += begin[len - 1] - '0';
    }
    return retval;
}

int main(int argc, char *argv[]) {
    struct stat stat;
    uint_fast64_t n;
    uint_fast16_t l;
    int i;
    int fd = open(argv[1], O_RDONLY);
    fstat(fd, &stat);
    buffer = mmap(NULL, stat.st_size, PROT_READ, MAP_SHARED, fd, 0);
    char *c = buffer;
    goto start;
    for(; *c; ++c) {
        (void) memset(counts, 0, sizeof(counts));
start:
        for(l = 1; ; ++l, ++c) {
            n = parse_n(&c);
            ++counts[n];
            if(*c == '\n') break;
        }
        uint_fast16_t half = l >> 1;
        for(i = 0; i < 101; ++i) {
            if(counts[i] > half) {
                printf("%d\n", i);
                break;
            }
        }
        if(i == 101) printf("None\n");
    }
}
▶ No.810068>>810171
Nice problem.
[spoiler]1. Use a linear time selection algorithm to find the median.
2. The median doesn't necessarily appear >L/2 times, so count the number of occurrences.
3. If it does then return the median, else return None.[/spoiler]
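For the non-believers, a rough sketch of that approach, assuming an expected-linear randomized quickselect is acceptable (the strict version would use median-of-medians for guaranteed linear time):
import random

def quickselect(seq, k):
    # expected O(n): returns the k-th smallest element (0-based)
    pivot = random.choice(seq)
    lo = [x for x in seq if x < pivot]
    hi = [x for x in seq if x > pivot]
    n_eq = len(seq) - len(lo) - len(hi)
    if k < len(lo):
        return quickselect(lo, k)
    if k < len(lo) + n_eq:
        return pivot
    return quickselect(hi, k - len(lo) - n_eq)

def major(seq):
    median = quickselect(seq, len(seq) // 2)
    # step 2: the median isn't necessarily major, so verify the count
    return median if seq.count(median) > len(seq) / 2 else None

print(major([19, 19, 19, 92, 19]))  # 19
print(major([1, 2, 3, 4]))          # None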
▶ No.810087>>810168 >>810182
Exact same as this: >>810038 , but multithreaded:
#include <iostream>
#include <fstream>
#include <string>
#include <thread>
#include <vector>
#include <cstdint>

template<size_t N>
inline uint16_t count_each(std::string line, uint16_t(& count_of)[N])
{
    uint8_t curval = 0;
    uint16_t n_numbers = 1;
    for (size_t i = 0, sz = line.size(); i < sz; ++i){
        char c = line[i];
        if (c == ','){
            ++count_of[curval];
            curval = 0;
            ++n_numbers;
            continue;
        }
        curval = curval * 10 + (c - '0');
    }
    ++count_of[curval];
    return n_numbers;
}

template<size_t N>
inline size_t argL2(uint16_t(& count_of)[N], uint16_t l2)
{
    for (size_t i = 0; i < N; ++i)
        if (count_of[i] > l2)
            return i;
    return N;
}

inline void major(const char* filename)
{
    std::vector<std::string> lines; lines.reserve(40);
    std::ifstream file(filename);
    for (std::string line;;){
        getline(file, line);
        if (!file) break;
        lines.push_back(std::move(line));
    }
    size_t n_threads = std::thread::hardware_concurrency();
    n_threads = n_threads == 0 ? 1 : n_threads;
    n_threads = n_threads > lines.size() ? lines.size() : n_threads;
    std::vector<std::thread> threads(n_threads);
    size_t interval = lines.size() / n_threads;
    for (size_t i = 0, start = 0; i < n_threads; ++i, start += interval){
        threads[i] = std::thread([&lines, start, finish = (i < n_threads - 1 ? start + interval : lines.size())]{
            for (size_t i = start; i < finish; ++i){
                uint16_t count_of[101] = {};
                const uint16_t sz = count_each(std::move(lines[i]), count_of);
                size_t res = argL2(count_of, sz / 2);
                lines[i] = res == sizeof(count_of) / sizeof(count_of[0]) ? "None" : std::to_string(res);
            }
        });
    }
    for (std::thread& thr : threads)
        thr.join();
    for (const std::string& res : lines)
        std::cout << res << '\n';
}

int main(int argc, char* argv[])
{
    major(argv[1]);
}
▶ No.810094>>810097 >>810100 >>810132 >>810133 >>810137
>>809592
python2.7 solution that's more similar to the C solutions yet still slow as fuck, but faster than the sqlite solution. 1.5s vs 1.8-2s with 8 processes, 40 x 50k lines (i know it's 30k but for similar timing to the sqlite solution tests).
#!/bin/python2.7
import argparse
from multiprocessing import Pool
import cStringIO

def dowork(sequence):
    input_stream=cStringIO.StringIO(sequence)
    count_list=[0]*50001
    num=''
    total_num=0
    next=input_stream.read(1)
    while (next!=''):
        if next!=',':
            num+=next
        else:
            int_num=int(num)
            count_list[int_num]+=1
            total_num+=1
            num=''
        next=input_stream.read(1)
    target_num=int(total_num)/2
    for index,count in enumerate(count_list):
        if count>=target_num:
            return index
    return None

def main():
    parser=argparse.ArgumentParser()
    parser.add_argument('filename',type=str)
    args=parser.parse_args()
    try:
        sequences=open(args.filename,'r').read().rstrip().split('\n')
    except:
        print 'None'
    p=Pool(8)
    majors=p.map(dowork,sequences)
    for major in majors:
        print major

if __name__=="__main__":
    main()
▶ No.810097
>>810094
change count_list=[0]*50001 to
count_list=[0]*30001
▶ No.810100>>810102
>>810094
looking at it now, this is a stupid way of doing it; i should be counting numbers 0-100, not every one of the 30k integers.
▶ No.810102
>>810100
which is exactly what it's doing, fml, just change count_list=[0]*50001 to count_list=[0]*101, which is all it's using of that list anyway
performance now 1.3s compared to 1.5
▶ No.810132>>810182 >>811220
>>810094
>ugh all this disgusting python
have some Serious Perl:
#! /usr/bin/env perl
use common::sense;

LINE: while (defined(my $line = <>)) {
    my %counts;
    my @numbers = $line =~ /(\d+)/g;
    for (@numbers) {
        if (++$counts{$_} > @numbers/2) {
            print $_, "\n";
            next LINE;
        }
    }
    print "None\n";
}
▶ No.810133>>810221
improved version that only parses the data once.
>>810094
reads the data twice, the first time to split on newlines and then again to parse the sequences. this version does it all at once. It still reads the file into a string first before parsing; reading the file directly during parsing was 20% slower than using cStringIO.
it performs about the same single threaded and worse multi-threaded, because there's less to multithread.
#!/bin/python2.7
import argparse
from multiprocessing import Pool
import cStringIO

def parse(sequences):
    input_stream=cStringIO.StringIO(sequences)
    count_list_list=[]
    next=input_stream.read(1)
    while (next!=''):
        count_list=[0]*101
        total_num=0
        num=''
        while(next!='\n' and next!=''):
            if next!=',':
                num+=next
            else:
                int_num=int(num)
                count_list[int_num]+=1
                total_num+=1
                num=''
            next=input_stream.read(1)
        target_num=int(total_num)/2
        count_list_list.append((count_list,target_num))
        next=input_stream.read(1)
    return count_list_list

def findMajor( (count_list,target_num) ):
    for index,count in enumerate(count_list):
        if count>=target_num:
            return index
    return None

def main():
    parser=argparse.ArgumentParser()
    parser.add_argument('filename',type=str)
    args=parser.parse_args()
    try:
        sequences=open(args.filename,'r').read()
    except:
        print 'None'
    count_list_list=parse(sequences)
    p=Pool(1)
    majors=p.map(findMajor,count_list_list)
    for major in majors:
        print major

if __name__=="__main__":
    main()
▶ No.810137
>>810094
also
split('\n') on a string is about 100 times slower than readlines() on a file, but since it only happens once it's the difference between 10ms and .01ms.
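citation needed on the exact ratio; a minimal harness to reproduce it yourself (synthetic data and a StringIO stand-in for the file, so your numbers will vary):
import timeit

setup = "data = '1,2,3\\n' * 40000"
print(timeit.timeit("data.split('\\n')", setup=setup, number=100))
print(timeit.timeit("list(StringIO(data))",  # line iteration, as a stand-in for readlines()
                    setup="from io import StringIO\n" + setup, number=100))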
▶ No.810168
>>810087
>>810038
i think the first c++ solution is more readable/simpler than yours
▶ No.810171
▶ No.810181
Codegolf Perl ported to work on shit site
use List::Util(first);while(<>){chomp;@a=split/,/;%c={};$c{$_}++for@a;print(((first{$c{$_}>~~@a/2}@a)||None).$/)}
>>809928
This is the first time I've seen Control.Arrow used in actual code
▶ No.810182
▶ No.810183>>810184 >>810189 >>810201 >>810347
Okay here is a chart of the current standings, including the rating given by the pajeet site.
(i am not sure if there is only one python2 anon here or if it's more than one, please let me know if i missed someone's program)
author  Language  Time, ms  Memory, bytes  Ranking Points
808872  c              3        253952     34.783
809731  c++           12          8760     34.972
810038  c++            2        706153     34.407
809861  go           933       9961472     25.055
809276  py3          195       5029888     30.461
810133  py2          446       5069896     29.989
810181  perl         128       5988301     29.779
▶ No.810184
>>810183
Forgot to mention for this challenge the max score is 35 but the max is usually unobtainable
▶ No.810187>>810191 >>810192
>>809703
best python solution
#!/bin/python2.7
import argparse
import subprocess
import zlib
import os
#thanks internet!
p_string='''x\x9c\x8dR\xcbn\x830\x10\xbc\xfb+\xb6\xc9!\x90\xa06\xf4\xd0\x03\xaf/@\xfc@\x8a"\x02v\xb2\n\x98\nL\xab6\xe2\xdf\xebG\xd2\x84@\xa4\xee\xc5\xf6\xeexvfm2G\x9e\x97]A!\xc0\xba\x15\r\xcd\xaa\x88\\sl\x9c\xcae\n\xf9>"d^P\x86\x9cB\x02\xee\xda%\x04\xb9\x80*Cn\xa9M\xd6\xecs\x07\xf2C\xd6,\x97\xea\xf0i\xc3\x89\x80\x8cNV\xb7,k\x85\xfb\xb6\x15\x90\xd7\x1d\x17\x9b$\xf5u\xad\x15\x85\xe7U\xb4j\xa9\xb0t\xc5\x81\xb5\x03-\xfe\xd0\x9a\x19\xa8\xed\x13\x8dD\x06\x96F\xe3Y! gX\xd2\x93j\xb5qS\xc7pIK\x9e\xb7C\x9e5\xdf\xfdE\xc0X\x04\x87\x10\xd6\xfe\x83b<,~\x1dd\x17\xb0L7\x88"\xe0\xb7\xbc*V+c\x8a\xa7\xfe]>\x1e&\x94\x07\xc3\xf3\xfcA\xe9\xd1\xb2!\x0ca\xe1,\xee\t5\xd6\xe0p\xcf\xeb\x86Z\xae=d\xea\x07\'Z\xb6t\x82"\x86\x97\x10^\xcf\xf3\xbb\x8d]]\x97\xc0\xa4\xe6BZe\x99\xbc\xed\x8f0\xacn\xe4\xc0\xe5C\xc8\x89\xa0\x9e\x88\\\x02H|i\x0b\xa7\x04_\x0c\x9aY`\nQ\x08\xf1#\xa0\n\xfd^\x12- \x08\x14w\x00\xb3w>\x1b+\xb9*2\x82E\xd3M\xe8\xfd\xf3&\xff\xc6q\xba\xdc\x8f\xb2\xe3\x8cr\xf0\xa4;=\x92>\x90=KjN\'U\xf7\xe3\xb1\xff\xeb\xb3\xdf_\xba\xfb\x8dC\xd5f\xd7\x93\x9e\xfc\x02\xddQ\xe8$'''
def main():
    parser=argparse.ArgumentParser()
    parser.add_argument('filename',type=str)
    args=parser.parse_args()
    p=subprocess.Popen(['g++','-xc++','-O2','-'],stdout=subprocess.PIPE,
                       stdin=subprocess.PIPE,stderr=subprocess.STDOUT,
                       universal_newlines=True)
    p.communicate(zlib.decompress(p_string))
    try:
        print subprocess.check_output(os.getcwd()+'/a.out '+args.filename,shell=True)
    except subprocess.CalledProcessError as result:
        print result.output

if __name__=="__main__":
    main()
▶ No.810189
>>810183
unless i missed something i think i'm the only one posting python2.7
▶ No.810191
>>810187
>#thanks internet!
that thread is pure gold
▶ No.810192
>>810187
i made a binary version of this but the string is too long to post.
▶ No.810201>>811220
>>810183
Golf Perl with an array instead of hashes:
use List::Util(first);@c=(0)x101;while(<>){chomp;@a=split/,/;$l=0;for(@a){$c[$_]++;$l++};print(((first{$c[$_]>~~$l/2}@a)||None).$/);$c[$_]=0 for 0..$#c}
Javashit:
var readline = require("readline");
var fs = require("fs");
var rl = readline.createInterface({
    input: fs.createReadStream(process.argv[2])
});
rl.on("line", ln => {
    var cnt = new Array(101);
    cnt.fill(0);
    var l = 0;
    ln.split(',').forEach(num => {
        cnt[parseInt(num)] += 1;
        l += 1;
    });
    var half = (l / 2)|0;
    var res = cnt.findIndex(x => x > half);
    console.log(res < 0 ? "None" : res);
});
▶ No.810203>>810208 >>810367
time to kill yourself, queen.
extern crate memchr;
extern crate memmap;

use std::thread::{self};
use memchr::{memchr};

const MAX_TESTS: usize = 40;

struct Results {
    array: [u8; MAX_TESTS],
    index: usize
}

impl Results {
    fn new() -> Self {
        Results {
            array: [0; MAX_TESTS],
            index: 0
        }
    }
    fn push(&mut self, result: Option<usize>) {
        self.array[self.index] = result.map_or(1, |r| r as u8 + 2);
        self.index += 1;
    }
    fn extend(&mut self, results: &Results) {
        self.array[self.index..][..results.index].copy_from_slice(&results.array[..results.index]);
        self.index += results.index;
    }
}

fn main() {
    let mut args = std::env::args_os().skip(1);
    assert_eq!(args.len(), 1);
    let mmap = memmap::Mmap::open_path(args.next().unwrap(), memmap::Protection::Read).unwrap();
    let slice = unsafe { mmap.as_slice() };
    // let results = work(slice);
    let results = split(slice, 2);
    for &r in &results.array[..results.index] {
        match r {
            0 => { unreachable!(); }
            1 => { println!("None"); }
            _ => { println!("{}", r - 2); }
        }
    }
}

fn split(slice: &[u8], depth: usize) -> Results {
    if slice.is_empty() {
        Results::new()
    } else if depth == 0 {
        work(slice)
    } else if let Some(i) = memchr(b'\n', &slice[slice.len() / 2..]) {
        let i = slice.len() / 2 + i;
        if i == slice.len() {
            work(slice)
        } else {
            let (sl, sr) = unsafe { std::mem::transmute::<(&[u8], &[u8]), (&'static [u8], &'static [u8])>(slice.split_at(i + 1)) };
            let tl = thread::spawn(move || split(sl, depth - 1));
            let tr = thread::spawn(move || split(sr, depth - 1));
            let mut results = tl.join().unwrap();
            results.extend(&tr.join().unwrap());
            results
        }
    } else {
        work(slice)
    }
}

fn work(slice: &[u8]) -> Results {
    fn linefeed(results: &mut Results, count: &mut u8, counts: &mut [u8; 101], n: usize) {
        counts[n] += 1;
        results.push(counts
            .iter()
            .cloned()
            .enumerate()
            .filter(|&(_, v)| v > (*count + 1) / 2)
            .next()
            .map(|(i, _)| i)
        );
        *count = 0;
        *counts = [0; 101];
    }
    let mut iter = slice.iter().cloned();
    let mut results = Results::new();
    let mut count = 0;
    let mut counts = [0; 101];
    while let Some(b) = iter.next() {
        let mut n = (b - b'0') as usize;
        match iter.next() {
            Some(b',') => { count += 1; counts[n] += 1; }
            Some(b'\n') | None => { linefeed(&mut results, &mut count, &mut counts, n); }
            Some(b) => {
                const LOOKUP: [usize; 10] = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90];
                n = LOOKUP[n] + (b - b'0') as usize;
                match iter.next() {
                    Some(b',') => { count += 1; counts[n] += 1; }
                    Some(b'\n') | None => { linefeed(&mut results, &mut count, &mut counts, n); }
                    _ => {
                        match iter.next() {
                            Some(b',') => { count += 1; counts[100] += 1; }
                            Some(b'\n') | None => { linefeed(&mut results, &mut count, &mut counts, 100); }
                            _ => { unreachable!() }
                        }
                    }
                }
            }
        }
    }
    results
}
▶ No.810208>>810209
>>810203
>impl Results {
gcc shit.c
shit.c:1:8: error: unknown type name ‘crate’
extern crate memchr;
^~~~~
shit.c:1:14: warning: built-in function ‘memchr’ declared as non-function
extern crate memchr;
^~~~~~
shit.c:2:8: error: unknown type name ‘crate’
extern crate memmap;
...
▶ No.810209>>810214
>>810208
>gcc
nigger, are you trying to compile rust code with gcc?
▶ No.810214>>810215
>>810209
what is rust some kind of meme library for c++? maybe i should try g++
▶ No.810215
▶ No.810221>>810682 >>810729
>>810133
additional 30% speed improvement.
now with fastInt(tm) int lookup table instead of int cast
1.5-1.8s -> 1.1-1.25s single threaded, on my test data.
#!/bin/python2.7
import argparse
from multiprocessing import Pool
import cStringIO
int_lookup=None
def generateFastInt():
global int_lookup
int_lookup={}
for i in range(0,101):
int_lookup[str(i)]=i
def parse(sequences):
global int_lookup
input_stream=cStringIO.StringIO(sequences)
count_list_list=[]
next=input_stream.read(1)
while (next!=''):
count_list=[0]*101
total_num=0
num=''
while(next!='\n' and next!=''):
if next!=',':
num+=next
else:
count_list[int_lookup[num]]+=1
total_num+=1
num=''
next=input_stream.read(1)
target_num=int(total_num)/2
count_list_list.append((count_list,target_num))
next=input_stream.read(1)
return count_list_list
def findMajor( (count_list,target_num) ):
for index,count in enumerate(count_list):
if count>=target_num:
return index
return None
def main():
generateFastInt()
parser=argparse.ArgumentParser()
parser.add_argument('filename',type=str)
args=parser.parse_args()
try:
sequences=open(args.filename,'r').read()
except:
print 'None'
count_list_list=parse(sequences)
p=Pool(1)
majors=p.map(findMajor,count_list_list)
for major in majors:
print major
if __name__=="__main__":
main()
▶ No.810239>>810241 >>810317 >>810929
Updated version
The new perl version performed worse than the old one so i left the old one in.
author  Language  Time, ms  Memory, bytes  Ranking Points out of 35
808872  c              3        253952     34.783
809731  c++           12          8760     34.972
810038  c++            2        706153     34.407
809861  go           933       9961472     25.055
809276  py3          195       5029888     30.461
810221  py2          340       5091500     30.156
810181  perl         128       5988301     29.779
810201  js           179        315392     34.424
▶ No.810241>>810243 >>810359 >>810360
>>810239
>he didnt include my rust version which ironically is the fastest by far
lol
▶ No.810243
>>810241
Queen paid me off so i would ignore Rust solutions.
▶ No.810262>>810319 >>810448 >>810922 >>810929
>C-s lisp C-m
>nothing found
>8ch in 2017
(defun main (pathname)
  (labels ((main-loop (list result)
             (cond ((null list) (nreverse result))
                   (T (main-loop (cdr list)
                                 (cons (analyse-list (car list)) result))))))
    (main-loop (file-to-list pathname) nil)))

(defun analyse-list (list)
  "Print the element that appears more than L/2 times."
  (labels ((analyse-list (list cnt)
             (cond ((= cnt (length list)) (format t "~&None."))
                   ((> (num-appearance (nth cnt list) list)
                       (/ (length list) 2))
                    (format t "~&~S" (nth cnt list)))
                   (T (analyse-list list (1+ cnt))))))
    (analyse-list list 0)))

(defun num-appearance (number list &optional (appearances 0))
  "How many times a number appears in a given list"
  (cond ((null list) appearances)
        ((= (car list) number)
         (num-appearance number (cdr list) (1+ appearances)))
        (T (num-appearance number (cdr list) appearances))))

(defun file-to-list (pathname)
  "Make a list from a file"
  (with-open-file (s pathname)
    (labels ((file-to-list (pathname reslist)
               (let ((line (read-line s nil :eof)))
                 (if (equalp line :eof)
                     (nreverse reslist)
                     (file-to-list pathname
                                   (cons (line-to-list line) reslist))))))
      (file-to-list pathname nil))))

(defun line-to-list (string)
  "Make a list from a string formed as num,num,num with 0 < num < 100"
  (labels ((line-to-list (string length cnt-start cnt reslist)
             (cond ((= length (- cnt 1)) (nreverse reslist))
                   ((or (= length cnt)
                        (string= (aref string cnt) #\,))
                    (if (= (- cnt cnt-start) 2)
                        (line-to-list string length (1+ cnt) (1+ cnt)
                                      (cons (+ (* 10 (digit-char-p (aref string cnt-start)))
                                               (digit-char-p (aref string (1+ cnt-start))))
                                            reslist))
                        (line-to-list string length (1+ cnt) (1+ cnt)
                                      (cons (digit-char-p (aref string cnt-start))
                                            reslist))))
                   (T (line-to-list string length cnt-start (1+ cnt) reslist)))))
    (line-to-list string (length string) 0 0 nil)))
Enjoy shit recursion faggot :^)
▶ No.810275>>810729
trying to increase this python performance I found numba which i haven't heard before for some reason
https://numba.pydata.org/
it apparantly compiles python code in real time or something, I've been trying to get it working and was successful to some extent, but trying to deal with it is such a massive pain in the ass that I might as well just write the shit in C.
it's an interesting library anyway. it works well if your doing math in loops, but you have to play all kinds of games to get strings, or character arrays because strings aren't supported at all, doing anything but throwing errors.
▶ No.810317>>810318 >>810332 >>810349 >>810381
>>810239
Try
use List::Util(first);@c=(0)x101;while(<>){chomp;@a=split/,/;$c[$_]++for(@a);$h=(~~@a)/2;print(((first{$c[$_]>$h}@a)||None).$/);$c[$_]=0 for 0..$#c}
and
use List::Util(first);@c=(0)x101;while(<>){chomp;@a=split/,/;$c[$_]++for(@a);$h=(~~@a)/2;print(((first{$c[$_]>$h}@a)||None).$/);@c=(0)x101}
▶ No.810318
>>810317
dont you know what linebreaks are?
▶ No.810319
>>810262
How do I into this level of lisp? It's always been a mystery to me.
t.imperative pleb
▶ No.810332
>>810317
this code doesn't look like it respects my freedoms
▶ No.810347>>810351 >>810358
>>810183
Oh now that is some fucking bullshit. So as usual, it comes down to what compiler they're using with what version on what platform. So here we go, micro-optimization time:
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

#define MAX_L 30000
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t counts[101];

static inline uint_fast16_t parse_n(char **c) {
    uint_fast16_t retval = 0;
    char *begin = *c;
    char *end = begin;
    while(*end & 0x10) ++end;
    int len = end - begin;
    if(len > 2) retval = 100;
    else if(len <= 1) retval = begin[0] - '0';
    else retval = (begin[0] - '0') * 10 + begin[1] - '0';
    *c = end;
    return retval;
}

int main(int argc, char *argv[]) {
    struct stat stat;
    uint_fast16_t n;
    uint_fast16_t l;
    int i;
    int fd = open(argv[1], O_RDONLY);
    fstat(fd, &stat);
    buffer = (char *) malloc(stat.st_size);
    read(fd, buffer, stat.st_size);
    char *c = buffer;
    goto start;
    for(; *c; ++c) {
        (void) memset(counts, 0, sizeof(counts));
start:
        for(l = 1; ; ++l, ++c) {
            n = parse_n(&c);
            ++counts[n];
            if(*c == '\n') break;
        }
        uint_fast16_t half = l >> 1;
        for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) {
            if(counts[i] > half) {
                printf("%d\n", i);
                break;
            }
        }
        if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n");
    }
}
I've also left out the mmap() since they've broken/disabled it somehow, which is fucking garbage as that would only use 4k of memory while having next to no overhead. I'll make another post in a sec re getting buttfucked by gcc as it's kinda interesting.
▶ No.810351>>810357 >>810381 >>810384
>>810347
I should have had coffee, first. Use this one instead, the other one potentially overruns. The rustfag will love that.
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

#define MAX_L 30000
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t counts[101];

static inline uint_fast16_t parse_n(char **c) {
    uint_fast16_t retval = 0;
    char *begin = *c;
    char *end = begin;
    while(*end & 0x10) ++end;
    int len = end - begin;
    if(len > 2) retval = 100;
    else if(len <= 1) retval = begin[0] - '0';
    else retval = (begin[0] - '0') * 10 + begin[1] - '0';
    *c = end;
    return retval;
}

int main(int argc, char *argv[]) {
    struct stat stat;
    uint_fast16_t n;
    uint_fast16_t l;
    int i;
    int fd = open(argv[1], O_RDONLY);
    fstat(fd, &stat);
    buffer = (char *) malloc(stat.st_size + 1);
    read(fd, buffer, stat.st_size);
    buffer[stat.st_size] = 0;
    char *c = buffer;
    goto start;
    for(; *c; ++c) {
        (void) memset(counts, 0, sizeof(counts));
start:
        for(l = 1; ; ++l, ++c) {
            n = parse_n(&c);
            ++counts[n];
            if(*c == '\n') break;
        }
        uint_fast16_t half = l >> 1;
        for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) {
            if(counts[i] > half) {
                printf("%d\n", i);
                break;
            }
        }
        if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n");
    }
}
▶ No.810357
>>810351
>overruns
queen btfo lmao
▶ No.810358>>810365 >>810416
>>810347
So in my code above, you might notice the "while(*end & 0x10) ++end;". What it actually does doesn't matter (it's skipping over digits and stopping on ',' and '\n'), but how gcc and clang handle it does. Here's a test function with it:
char *foo(char *c) {
    while(*c & 0x10) ++c;
    return c;
}
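Side note for anyone wondering why & 0x10 selects digits: every ASCII digit (0x30-0x39) has that bit set, and both separators don't. Easy to confirm without touching a compiler (Python here because it pastes easier):
for ch in "0123456789,\n":
    print(repr(ch), hex(ord(ch)), bool(ord(ch) & 0x10))
# '0'..'9' -> True; ',' (0x2c) and '\n' (0x0a) -> False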
Now the assembly I expect to get from this should be a pretty tight loop, but look at the differences between gcc and clang:
gcc:
foo:
.LFB0:
.cfi_startproc
testb $16, (%rdi)
movq %rdi, %rax
je .L2
.p2align 4,,10
.p2align 3
.L3:
addq $1, %rax
testb $16, (%rax)
jne .L3
.L2:
rep ret
clang:
foo: # @foo
.cfi_startproc
# BB#0:
decq %rdi
.align 16, 0x90
.LBB0_1: # =>This Inner Loop Header: Depth=1
testb $16, 1(%rdi)
leaq 1(%rdi), %rdi
jne .LBB0_1
# BB#2:
movq %rdi, %rax
retq
clang completely neglects speculative execution and does a pre-decrement to handle the special case of the loop being length zero. Since that case never happens yet it pays the cost for it in every case it runs twice as slow as gcc. Nothing I tried got clang to not do stupid shit with that case, so if this runs on clang it will be a heavy penalty to the tightest loop. I'll probably look into it more tonight and see if the loop can be rewritten to exclude the case of a zero length value.
▶ No.810359>>810374
>>810241
Rust + threads vs C is kinda bullshit since the test site isn't allowing C to use threads. I might try calling clone() via __syscall() later but I don't know if I want to put that amount of effort into it.
▶ No.810365>>810369
>>810358
i don't know how trivial a task it would be, but you can obviously read the assembly, can't you just inline it in C, especially since it's that small?
▶ No.810367>>810370 >>810380
>>810203
This one doesn't seem to work. Or I'm not compiling your faggy language correctly. The other rust one worked with the same method, though.
root@sid:~/src/foo# target/release/foo ../test.txt
0
1
0
1
0
0
0
0
0
1
0
0
0
0
0
0
0
0
23
1
1
1
0
0
0
0
0
0
0
3
0
27
62
0
0
3
0
2
0
2
▶ No.810369
>>810365
Yeah, but inline asm really isn't a good idea. Modern optimization is about tricking the compiler into giving you the assembly you wanted, as then it's still as portable as C but as good as hand-written assembly. gcc/clang also have builtins for things you can't get out of C, like popcount, which are also portable (a C replacement is transparently used on architectures that don't have it). I almost never do real assembly these days, and only in the kernel.
Btw, as example of how bad compilers are, if you use __builtin_expect() to tell it the length 2 case is the most frequent, it actually compiles it worse.
▶ No.810370>>810372 >>810376
>>810367
>doesnt seem to work
but according to your output it seems to work fine???
▶ No.810372>>810375
>>810370
Those aren't the correct results.
▶ No.810374
>>810359
there is actually a line commented out in the rust version. if you uncomment that and comment the next line, it will be single threaded
▶ No.810375>>810377 >>810378
>>810372
then your test data is bad
▶ No.810376
>>810370
I should add, it's for the python version that biases on 0. These are the expected results (there'll be a similar mix of 0/None for whatever you generate with it):
None
0
0
None
None
None
None
None
None
0
None
0
None
None
None
0
0
None
None
0
0
None
None
0
0
None
None
None
None
0
0
None
None
None
0
0
None
None
None
None
▶ No.810377
>>810375
Doesn't work with my tests too. Time to rerewrite it in Rust Steve.
▶ No.810378
>>810375
No fgt, use the test from >>809582 which generates a biased curve on 0 and compare output with everyone else.
root@sid:~/src# ./809582.py > lolrust.txt && foo/target/release/foo lolrust.txt
▶ No.810380>>810382
>>810367
oh shit i think im retarded and my shit is overflowing. can you run it in debug mode and see if it panics?
▶ No.810381>>810385 >>810387 >>810406
>>810317
first one didn't work, 2nd one did.
>>810351
nice runtime queen but the ranking is worse than your first entry so no update [1 371730 34.688]
latest and greatest
author Language Time, ms Memory, bytes Ranking Points out of 35
808872 c 3 253952 34.783
809731 c++ 12 8760 34.972
810038 c++ 2 706153 34.407
809861 go 933 9961472 25.055
809276 py3 195 5029888 30.461
810221 py2 340 5091500 30.156
810317 perl 93 5940749 29.880
810201 js 179 315392 34.424
▶ No.810382>>810383 >>810384 >>810398
>>810380
root@sid:~/src# ./foo.py > lolrust.txt && foo/target/debug/foo lolrust.txt
thread '<unnamed>' panicked at 'attempt to add with overflow', src/main.rs:109:22
note: Run with `RUST_BACKTRACE=1` for a backtrace.
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:860:4
thread '<unnamed>' panicked at 'attempt to add with overflow', src/main.rsthread ':<unnamed>109' panicked at ':attempt to add with overflow22thread '',
<unnamed>src/main.rs' panicked at ':attempt to add with overflow102', :src/main.rs20:
102:20
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:860:4
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:860:4
▶ No.810383
>>810382
FUGGGG :DDDDDDDDDDDDDDDDDDD
▶ No.810384>>810388 >>810389
>>810382
i fixed it but ultimately doesnt matter. your autism >>810351 >>810358 is faster and i dont care to waste more time on this retarded challenge.
btw your multithreaded version is still shit.
▶ No.810385
>>810381
>but the ranking is worse than your first entry
Such a great site. 1ms in 371k of memory and I lose. I'll game it harder and reduce the ram.
▶ No.810387>>810399
>>810381
well at least python isn't last.
and the python 3 version is using its own builtin counter function to do all the work.
maybe performance could be sped up in the python3 version using the same int lookup dict the python2 version does instead of the int cast in sequences()
▶ No.810388>>810390
>>810384
I'd recommend debugging why the multithreaded one runs poorly on your system as the result should be interesting. The only thing that could cause that (since otherwise it's basically running the same code per thread) is an unexpectedly slow CPU thread that delays everything else. It's why I told you to see if disabling affinity changes the result, since the scheduler will move it to another core. Usually this is either some background process eating a core that you didn't know about, or a laptop doing powersaving.
▶ No.810389>>810394
>>810384
>retarded challenge.
>being this mad
▶ No.810390>>810395
>>810388 (Heil Hitler!)
>I'd recommend wasting even more time just so another anon can get internet points
meh. im going to write codeeval to add rust support instead.
▶ No.810394>>810397
>>810389
actually im not mad. im just done. without going full unsafe and basically writing c in rust i wont beat queens solution
▶ No.810395
>>810390
They need to link in pthreads, too. Although threaded solutions will lose on that test as their testcase is too small (it's apparently 10 times smaller than the testcase data we're using and you can't really go faster than 1ms).
▶ No.810397
>>810394
RUST CONFIRMED FOR MEMELANG
▶ No.810398>>810401
>>810382
>panicked
>We went to lunch afterward, and I remarked to Dennis that easily half the code I was writing in Multics was error recovery code. He said, "We left all that stuff out. If there's an error, we have this routine called panic, and when it is called, the machine crashes, and you holler down the hall, 'Hey, reboot it.'"
▶ No.810399>>810402
>>809276
>>810387
using a lookup dict instead of int cast is definitely a 10-15% performance increase on the python3 version
#!/usr/bin/env python3
import sys
from collections import Counter
int_lookup={}
def genfastint():
global int_lookup
for i in range(0,101):
int_lookup[str(i)]=i
def fastint(str_int):
global int_lookup
return int_lookup[str_int]
def sequences(filename):
with open(filename, "r") as f:
for line in f:
yield [fastint(i) for i in line.rstrip().split(",")]
def major(sequence):
if not sequence:
return None
element, count = Counter(sequence).most_common()[0]
if count <= (len(sequence) / 2):
return None
return element
def main():
genfastint()
for sequence in sequences(sys.argv[1]):
print(major(sequence))
if __name__ == "__main__":
main()
▶ No.810401
>>810398
Honestly, although I removed the error handling from what I posted, my real code mostly has error handling that just calls abort(). If it's a case that should never happen, there's no reason to pay the cost of handling it, let alone add another codepath that needs testing.
▶ No.810402
>>810399
that might be enough to beat pajeetscript
▶ No.810406>>810439 >>810683
>>810381
Actually, before I go to the effort to do something fancy with the buffer, can you see what it reports for memory on this:
#include <stdint.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

#define MAX_L 30000
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t counts[101];

static inline uint_fast16_t parse_n(char **c) {
    uint_fast16_t retval = 0;
    char *begin = *c;
    char *end = begin;
    while(*end & 0x10) ++end;
    int len = end - begin;
    if(len > 2) retval = 100;
    else if(len <= 1) retval = begin[0] - '0';
    else retval = (begin[0] - '0') * 10 + begin[1] - '0';
    *c = end;
    return retval;
}

int main(int argc, char *argv[]) {
    struct stat stat;
    uint_fast16_t n;
    uint_fast16_t l;
    int i;
    int fd = open(argv[1], O_RDONLY);
    fstat(fd, &stat);
    buffer = mmap(NULL, stat.st_size, PROT_READ, MAP_SHARED, fd, 0);
    char *end = buffer + stat.st_size;
    char *c = buffer;
    goto start;
    for(; c < end; ++c) {
        (void) memset(counts, 0, sizeof(counts));
start:
        for(l = 1; ; ++l, ++c) {
            n = parse_n(&c);
            ++counts[n];
            if(*c == '\n') break;
        }
        uint_fast16_t half = l >> 1;
        for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) {
            if(counts[i] > half) {
                printf("%d\n", i);
                break;
            }
        }
        if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n");
    }
}
▶ No.810410>>810439
>tfw no ruby solution
File.readlines(ARGV[0]).each do |line|
  most = line.split(",").group_by{|i| i }.sort_by{ |k,v| v.length}.reverse.first[1]
  if most.length >= line.split(",").length.to_i / 2
    puts most.first
  else
    puts "None"
  end
end
▶ No.810416>>810425 >>810515
>>810358
I really hate it when the compiler insists on doing it wrong. I couldn't fix the loop with clang but I've managed to get clang to beat gcc (both with -O2) by rewriting it and assuming we can read some bytes past the 0 (which the compiler cannot take for granted). Timings:
8bits (original)
gcc: 37371347 ns
clang: 46252361 ns
32bits
gcc: 28575093 ns
clang: 14820225 ns
64bits
gcc: 25180917 ns (better than 32)
clang: 18220982 ns (worse than 32)
char *foo(char *c){
    int64_t *d = (int64_t*)c;
    while(1){
        switch(*d & 0x1010101010101010){
        case 0x1010101010101010: ++d; break;
        case 0x0010101010101010: return ((char*)d)+7;
        case 0x0000101010101010: return ((char*)d)+6;
        case 0x0000001010101010: return ((char*)d)+5;
        case 0x0000000010101010: return ((char*)d)+4;
        case 0x0000000000101010: return ((char*)d)+3;
        case 0x0000000000001010: return ((char*)d)+2;
        case 0x0000000000000010: return ((char*)d)+1;
        default: return (char*)d;
        }
    }
}

char *foo(char *c){
    int *d = (int*)c;
    while(1){
        switch(*d & 0x10101010){
        case 0x10101010: ++d; break;
        case 0x00101010: return ((char*)d)+3;
        case 0x00001010: return ((char*)d)+2;
        case 0x00000010: return ((char*)d)+1;
        default: return (char*)d;
        }
    }
}
Asm (int version only):
gcc:
foo:
.LFB0:
.cfi_startproc
movl (%rdi), %edx
andl $269488144, %edx
cmpl $4112, %edx
je .L3
.L13:
jle .L12
cmpl $1052688, %edx
je .L6
cmpl $269488144, %edx
jne .L9
addq $4, %rdi
movl (%rdi), %edx
andl $269488144, %edx
cmpl $4112, %edx
jne .L13
.L3:
leaq 2(%rdi), %rax
ret
.p2align 4,,10
.p2align 3
.L6:
leaq 3(%rdi), %rax
ret
.p2align 4,,10
.p2align 3
.L12:
cmpl $16, %edx
leaq 1(%rdi), %rax
jne .L9
rep ret
.p2align 4,,10
.p2align 3
.L9:
movq %rdi, %rax
ret
clang:
foo: # @foo
.cfi_startproc
# BB#0:
movl $269488144, %eax # imm = 0x10101010
jmp .LBB0_1
.p2align 4, 0x90
.LBB0_7: # in Loop: Header=BB0_1 Depth=1
addq $4, %rdi
.LBB0_1: # =>This Inner Loop Header: Depth=1
movl (%rdi), %ecx
andl %eax, %ecx
cmpl $269488143, %ecx # imm = 0x1010100F
jle .LBB0_2
# BB#6: # in Loop: Header=BB0_1 Depth=1
cmpl $269488144, %ecx # imm = 0x10101010
je .LBB0_7
jmp .LBB0_10
.LBB0_2:
cmpl $16, %ecx
je .LBB0_9
# BB#3:
cmpl $4112, %ecx # imm = 0x1010
je .LBB0_8
# BB#4:
cmpl $1052688, %ecx # imm = 0x101010
jne .LBB0_10
# BB#5:
addq $3, %rdi
movq %rdi, %rax
retq
.LBB0_9:
incq %rdi
.LBB0_10:
movq %rdi, %rax
retq
.LBB0_8:
addq $2, %rdi
movq %rdi, %rax
retq
▶ No.810425
>>810416
Nice! It also points to this being a problem solvable in the same way strlen() does it with vector operations to remove the conditionals (it has several versions of assembly code that test for 0 in multiple bytes simultaneously).
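For reference, the word-at-a-time trick those strlen() implementations are built on, sketched in Python so the arithmetic is visible (the bit hack itself is folklore, not any particular libc's code):
def has_zero_byte(word):
    # nonzero iff any byte of the 64-bit word is 0x00
    return ((word - 0x0101010101010101) & ~word & 0x8080808080808080) != 0

print(has_zero_byte(0x6162630065666768))  # True: one byte is 0x00
print(has_zero_byte(0x6162636465666768))  # False: all bytes nonzero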
▶ No.810439>>810441 >>810448
>>810406
its done my queen
>>810410
probably the shortest solution i think
Reordered by score
author Language Time, ms Memory, bytes Ranking Points out of 35
809731 c++ 12 8760 34.972
810406 c 1 32768 34.971
810038 c++ 2 706153 34.407
810201 js 179 315392 34.424
810410 ruby 221 794624 33.950
809276 py3 195 5029888 30.461
810221 py2 340 5091500 30.156
810317 perl 93 5940749 29.880
809861 go 933 9961472 25.055
▶ No.810441>>810447
>>810439
>12 times slower and gets the highest score
Fucking hell. It might be impossible to get 1ms under 8k of ram. Not sure how much headroom there is on that. I'll try it.
▶ No.810447>>810451
>>810441
did you consider just accepting c++ as the superior language?
▶ No.810448>>810458 >>810463
>>810439
Where is the common lisp code here >>810262
What a fucking faggot.
>/tech/ in 2017
▶ No.810451>>810453
>>810447
Technically I already have. I do most of my work in C++. But I never touch slow and broken shit like string, streambuf, or fstream.
▶ No.810453>>810456
>>810451
what are your suggested alternatives?
▶ No.810456>>810457
>>810453
C strings, C functions. It's all still valid C++ and they aren't full of mistakes (other than for wide char functions).
▶ No.810457>>810464
>>810456
Are you suggesting to do all IO in C++ via C-style file functions?
▶ No.810458
▶ No.810463
>>810448
Scheme but no Common Lisp?
Suck my dick.
Even Guile, that piece of shit, is here.
▶ No.810464>>810468
>>810457
Yes. You have to anyway for anything non-trivial as there are no C++ bindings to most things.
▶ No.810468>>810470
>>810464
>as there are no C++ bindings to most things.
explain further...
▶ No.810470>>810474
>>810468
Try to write some networking code in C++, for example. There's nothing standard (at least that I know of, but I currently only care about C++14). So you're either adapting C++ to C yourself, or using a library that does. You'll quickly find that libraries for this only go so far. As soon as you start doing something that isn't pleb-tier there are no bindings. Even when there are bindings, you're screwed as you can't pass data as a std::string without copying it. This quickly results in massive amounts of dynamic memory flying around and lots of little ones in remote corners of the code as you do getsockopts and ioctls or whatever that need string data passed around. You could wait for std::string_view, but as I mentioned in another thread, that's still a pretty shitty solution.
So, the simple solution to all this nonsense: just use C-style strings. C++ strings were a mistake and can be ignored.
▶ No.810473>>810502
>>810471
boost is pure cancer. If you want to kill a C++ project, use it liberally.
▶ No.810474>>810476
>>810470
>C-style strings
exploits ahoi!
▶ No.810476
>>810474
C++ strings don't really do anything for you re exploit prevention. They're actually a bit worse as in many (most?) implementations the actual buffer is on the heap even when declared on the stack. At least on stack there is the possibility of stack smashing protection blocking an exploit.
▶ No.810480>>810491
Yo queen, did you go to college to get your first job?
▶ No.810485>>810489 >>810682 >>811055
#!/usr/bin/env python3
from sys import argv
from collections import Counter

for line in open(argv[1]):
    nums = [int(x) for x in line.split(',')]
    [(major, count)] = Counter(nums).most_common(1)
    print(major if count > len(nums) / 2 else None)
▶ No.810489>>810490
>>810485
is this a new entry or an updated one by an old contestant?
▶ No.810490
>>810489
A new one. I went in blind and wrote it before looking at the other entries. It's not optimized for anything, just the most straightforward way I could think of. Might be useful as some kind of baseline.
▶ No.810491
>>810480
I went to college (MS in CS, UC system), but I've never been asked for my resume. I'd get offers through random things I'd be doing and I'd take interesting ones. Recently, I've been working both networking and gamedev. I have no background at all in gamedev, so it's been an interesting ride.
▶ No.810496>>810499 >>810932
I tried but I suck at coding.
I didn't really try to optimize it (don't know how really), but there's definitely room for improvement.
But I think my algorithm's a bit different. It should work for an unspecified bucket size as long as it's an int. There's probably some way to make the algorithm work without the second pass through, but I'm not sure how.
#include <stdio.h>
#include <stdlib.h>

#define MAXLEN 30000

int major_element(int *, int);

int main(int argc, char *argv[]){
    FILE *fp;
    int c; /* int, not char, so the EOF comparison works */
    char snum[6];
    char *p = snum;
    int numbers[MAXLEN];
    int len = 0;
    int major;
    if(argc < 2){
        printf("Need an argument\n");
        return 0;
    }
    fp = fopen(argv[1], "r");
    if(fp == NULL){
        printf("couldn't open file\n");
        return 0;
    }
    while((c = fgetc(fp)) != EOF){
        if('0' <= c && c <= '9'){
            *p++ = c;
        }
        else if(c == ' '){
            /* skip spaces */
        }
        else if(c == ','){
            *p++ = '\0';
            p = snum;
            numbers[len++] = atoi(snum);
        }
        else if(c == '\n'){
            *p++ = '\0';
            p = snum;
            numbers[len++] = atoi(snum);
            major = major_element(numbers, len);
            if(major == -1){
                printf("None\n");
            }
            else{
                printf("%d\n", major);
            }
            len = 0;
        }
        else{
            return 0;
        }
    }
    fclose(fp);
    return 0;
}

int major_element(int *list, int len){
    int count = 0;
    int major = 0;
    int occ = 0;
    int *listcp = list;
    for(int i = 0; i < len; i++){
        if(listcp[i] == major){
            count++;
        }
        else if(count > 0){
            count--;
        }
        else{
            count = 1;
            major = listcp[i];
        }
    }
    for(int i = 0; i < len; i++){
        if(list[i] == major){
            occ++;
        }
    }
    if(occ > len/2){
        return major;
    }
    return -1; /* -1 as the "none" sentinel, since 0 is a legal element */
}
This probably shouldn't have been written in C if all I cared about was not using a bucket
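For the record, that first pass is the Boyer-Moore majority vote algorithm, and the second pass can't be dropped: the vote only guarantees that *if* a major element exists, it's the surviving candidate. With no major element the candidate is garbage, so you have to verify. Same idea as a sketch (mine, not a cleanup of anon's C):
def majority_vote(seq):
    # pass 1 (Boyer-Moore vote): the only possible > len/2 candidate survives
    candidate, count = None, 0
    for x in seq:
        if count == 0:
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            count -= 1
    # pass 2: verify, because pass 1 lies when no major element exists
    return candidate if seq.count(candidate) > len(seq) / 2 else None

print(majority_vote([19, 92, 19, 19]))  # 19
print(majority_vote([1, 2, 3, 4]))      # None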
▶ No.810499
>>810496
already see something I should have done.
Should have initialized it as "char snum[6] = {'\0', '\0', '\0', '\0', '\0', '\0'};" (or just {0}).
▶ No.810502>>810509 >>810671
>>810473
>>810472
zero arguments anytime
Use of high-quality libraries like Boost speeds initial development, results in fewer bugs, reduces reinvention-of-the-wheel, and cuts long-term maintenance costs.
"The obvious solution for most programmers is to use a library that provides an elegant and efficient platform independent to needed services. Examples are BOOST..."
--- Bjarne Stroustrup, Abstraction, libraries, and efficiency in C++
▶ No.810509>>810514
>>810502
>smug anime girl
Every time.
I used boost for 10 years. I just recently finished removing all of it from all our code. Go have fun making mistakes.
▶ No.810514
>>810509
>boost was good enough to use for 10 years
>reinvent the wheel and rewrite it all yourself
>this will definitely be better than boost
Go have fun downgrading your code.
▶ No.810515>>810518 >>810667 >>810683 >>810798
>>810416
So I'm halfway through a workday and I've done no work so I should probably stop now, but I'm going to leave you with this. This is a modification to use something similar to your technique that you might find interesting. First, the code:
#include <stdint.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

#define MAX_L 30000
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t counts[101];

static inline uint_fast16_t parse_n(char **c) {
    char *begin = *c;
    uint32_t v = *((uint32_t *) *c);
    v &= 0x00101010;
    if(v == 0x00101010) { *c += 3; return 100; }
    else if(v == 0x00100010) { *c += 1; return begin[0] - '0'; }
    else { *c += 2; return (begin[0] - '0') * 10 + begin[1] - '0'; }
}

int main(int argc, char *argv[]) {
    struct stat stat;
    uint_fast16_t n;
    uint_fast16_t l;
    int i;
    int fd = open(argv[1], O_RDONLY);
    fstat(fd, &stat);
    buffer = malloc(stat.st_size + 3);
    read(fd, buffer, stat.st_size);
    // Magic.
    buffer[stat.st_size] = '\\';
    // Prevent valgrind from being scared of my huge feminine dick.
    buffer[stat.st_size + 1] = '\0';
    buffer[stat.st_size + 2] = '\0';
    char *end = buffer + stat.st_size;
    char *c = buffer;
    goto start;
    for(; c < end; ++c) {
        (void) memset(counts, 0, sizeof(counts));
start:
        for(l = 1; ; ++l, ++c) {
            n = parse_n(&c);
            ++counts[n];
            if(*c == '\n') break;
        }
        uint_fast16_t half = l >> 1;
        for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) {
            if(counts[i] > half) {
                printf("%d\n", i);
                break;
            }
        }
        if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n");
    }
}
This is disgustingly fast, if the compiler gets it right.
>user 0m0.000s
The parse_n gets really interesting here (and probably needs an ifdef for endianness, but I didn't bother), and the magic in the buffer, too. Basically, parse_n is now reading 4 bytes at a time and checking for a bit that matches digits, but there are a couple of difficult cases here. First, these are the possible strings as seen by parse_n and whether the bit would be set per character (second column). 'X' is a wildcard:
length 1:
9,9X 101X
9\n9X 101X
9\n\0X 100X
length 2:
99\nX 110X
99,X 110X
length 3:
999X 111X
So first, we can just AND off 'X' and ignore it, so we're really only considering 3 characters. Note that there are two possible values for length 1; '101' and '100', where the terminating null we'd usually place at the end of the string gives us some trouble. Now we could write an if / else if / else statement where length 1 is handled in the else so we don't have to do two comparisons for it, but the problem with this is the non-jumping portion of the resulting assembly will be the fastest due to how modern x86 processors work. And length 1 is a very rare case, roughly 10% of the numbers. We really want length 2 to be handled by the else, but then we have to do two comparisons for length 1.
The solution is to not null terminate the string. We can terminate it with something that has the 0x10 bit set instead. Then, when we read past the end of the string when considering a last digit that is length 1, it thinks it is still the '101' case. That lets us change the code to have the else statement handle the length 2 case without introducing another comparison.
The other two extra bytes in the buffer don't matter but I clear to make valgrind not complain. In real code you shouldn't do that as it wastes time just to make a tool happy, you should use the VALGRIND_MAKE_MEM_DEFINED macro, but that requires feature tests (yay autotools) so I didn't include it.
▶ No.810518
>>810515
Btw, it's not worth testing this one since it uses more ram to ensure it can do that termination trick on the buffer, just thought it'd be interesting to show you.
▶ No.810581
▶ No.810667>>810669
>>810515
>uint32_t v = *((uint32_t *) *c);
uuuuuuh excuse me? is this allowed? dont you have to make sure that it is aligned and shit? also wont you get different results depending on endianness?
please explain
▶ No.810669
>>810667
As I mentioned, I'm blowing off endianness and issues with meme hardware in that version so it stays simple. A production version would ifdef it. Just thought that anon would be interested in where he could take his AND-based version.
▶ No.810671>>810694
>>810502
>Bjarne Stroustrup
who the fuck is that even? some no name shitter shilling for boost
▶ No.810682>>810683
>>810485
i've run out of ideas trying to do anything with
>>810221
the only thing i can think of is just use the collections library which would make it identical to the python3 version. the answer is just don't use python, or concentrate on multi-thread performance over single-threaded.
if this was a real life problem the data would probably already be in some kind of sql database and the sql solution would just be done
▶ No.810683>>810690
>>810515
>>810406
>>810682
would these C solutions scale to data that exceeds the system's ram, i wonder?
▶ No.810690
>>810683
If you wanted to remove the problem's limit on the number of test cases or the length of a test, the mmap() version would scale as the backing is managed by the OS and responds to memory pressure (unless you mprotect it, which is almost always a bad idea). If you also wanted to remove the "a filename" restriction (e.g., to feed it a potentially gargantuan amount of data larger than your filesystem can handle), at the price of some performance, you could convert it to a dual-buffer setup with two bytes of overlap (ringbuffers would be slower). That was my plan before I decided I needed to do actual work today.
▶ No.810694
>>810671
pajeet detected. we don't tolerate your kind around here
▶ No.810729>>810738 >>810742 >>810743
>>810221
>>810275
300% python2 boost
i wanted to get some experience with numba so i rewrote this solution for it, which isn't really the same solution at all now. i doubt the pajeet site will run it anyway, numba isn't part of the standard library, you can get it with pip though.
this solution uses numpy and numba (numba uses numpy datatypes). on my test data the original was 1.8s, this version is 0.55s. I don't know why it's 1.8s now instead of 1.1s, i must have fucked with the generator i was using. either way, it's a 300% performance increase. I took out all the multiprocessing, but it would benefit from it with n cores/sequences. it reads the data twice, first to split on newlines and then to parse each line individually. each line could be sent to an individual process.
I thought this numba thing was going to be a meme but it really is a shit ton faster and not really that complicated to get going once you get a hang for what data types it accepts and what it's limits are. i'm sure i'll use it in the future.
#!/bin/python2.7
import argparse
import numba
import numpy as np

@numba.jit("uint16[:](uint8[:])",locals={ 'integer':numba.uint8,'nparray_count':numba.uint16[:] })
def countNumbers(nparray_seq):
    #define the return array at the top for a high performance loop; nopython=True doesn't
    #support returning arrays but will compile the loop using its fast datatypes
    #if the array is declared beforehand.
    nparray_count=np.zeros((1,102),dtype=np.uint16)[0]
    for integer in nparray_seq:
        nparray_count[integer]+=1
    return nparray_count

@numba.jit("uint16(uint16[:])",nopython=True,locals={'total_num':numba.uint16} )
def findTotalNum(nparray_count):
    total_num=0
    for x in range(102):
        total_num+=nparray_count[x]
    return total_num

@numba.jit("uint8(uint16[:])",nopython=True, locals={'target_num':numba.uint16,'major':numba.uint8,
                                                     'index':numba.uint8,'count':numba.uint16 } )
def findMajor(nparray_count):
    target_num=findTotalNum(nparray_count)/2
    major=102 #no None for numba.uint8, use 102
    for index,count in enumerate(nparray_count):
        if count>=target_num:
            major=index
    return major

def main():
    parser=argparse.ArgumentParser()
    parser.add_argument('filename',type=str)
    args=parser.parse_args()
    try:
        sequences=open(args.filename,'r').readlines()
    except:
        print 'None'
    for seq in sequences:
        nparray_seq=np.fromstring(seq,dtype=np.uint8,count=-1,sep=',')
        nparray_count=countNumbers(nparray_seq)
        major=findMajor(nparray_count)
        if major==102: #no None for numba.uint8
            print 'None'
        else:
            print major

if __name__=="__main__":
    main()
▶ No.810731
basically numba doesn't handle a lot of shit, but it handles the majority of what you would be doing inside of a loop anyway, just not any string manipulation, it has to be done on numbers.
just import jit and putting
@numba.jit
in front of every function definition is not the ticket to performance. if you do that it'll try to predict what's going on and speed things up, but it'll fall back into its pyobject mode which is slower than straight python if you don't write code specifically with numba in mind.
▶ No.810732
>>810730
i tried to delete this immediately to replace "is the ticket to performance" to " is not the ticket to performance" but now its bitching about wrong password
▶ No.810738
>>810729
also it's not really useful if the loop isn't significant, or if the loop isn't getting called more than once because there's numba compile overhead.
▶ No.810742
>>810729
enabling caching takes it down to 0.35-0.4s. caching compiles once to a file on first run, and if you run it again it'll read that instead of compiling and remove the overhead. measuring timing inside python it only reports a total time of 0.04s, not 0.35s with "time python2.7 ...". I don't know where the extra 0.3s is coming from, i guess it's io or something.
if python's internal timing is right it's only 40ms on my machine with worst case testdata, which should put it above all the other scripting languages. it's kind of cheating though, this library is pretty much using its own jit instead of python's.
▶ No.810743>>810755 >>811087
>>810729
with caching enabled, gil disabled on nopython numba code, multiprocessing, and the break in findMajor() that i forgot to add.
#!/bin/python2.7
import argparse
import numba
import numpy as np
from multiprocessing import Pool
import time

@numba.jit("uint16[:](uint8[:])",cache=True,locals={ 'integer':numba.uint8,'nparray_count':numba.uint16[:] })
def countNumbers(nparray_seq):
    #define the return array at the top for a high performance loop; nopython=True doesn't
    #support returning arrays but will compile the loop using its fast datatypes
    #if the array is declared beforehand.
    nparray_count=np.zeros((1,102),dtype=np.uint16)[0]
    for integer in nparray_seq:
        nparray_count[integer]+=1
    return nparray_count

@numba.jit("uint16(uint16[:])",nopython=True,nogil=True,cache=True, locals={'total_num':numba.uint16} )
def findTotalNum(nparray_count):
    total_num=0
    for x in range(102):
        total_num+=nparray_count[x]
    return total_num

@numba.jit("uint8(uint16[:])",nopython=True,nogil=True,cache=True, locals={'target_num':numba.uint16,'major':numba.uint8,
                                                                           'index':numba.uint8,'count':numba.uint16 } )
def findMajor(nparray_count):
    target_num=findTotalNum(nparray_count)/2
    major=102 #no None for numba.uint8, use 102
    for index,count in enumerate(nparray_count):
        if count>=target_num:
            major=index
            break
    return major

def worker(seq):
    nparray_seq=np.fromstring(seq,dtype=np.uint8,count=-1,sep=',')
    nparray_count=countNumbers(nparray_seq)
    major=findMajor(nparray_count)
    if major==102: #no None for numba.uint8
        return 'None'
    else:
        return major

def main():
    total_time=time.time()
    parser=argparse.ArgumentParser()
    parser.add_argument('filename',type=str)
    args=parser.parse_args()
    try:
        sequences=open(args.filename,'r').readlines()
    except:
        print 'None'
    p=Pool(1)
    results=p.map(worker,sequences)
    for result in results:
        print result

if __name__=="__main__":
    main()
▶ No.810755>>810758 >>810762 >>810809
>>810743
that's a lot of messy code and imports considering it's still not as fast as the c/c++ solutions.
was python a mistake?
▶ No.810758
>>810755
if you're going to spend the time optimizing everything for numba you might as well write the fucking thing in C and have all the speed with no restrictions. I guess it's worth it if you have something big in python and only need to speed up a couple of loops.
▶ No.810762>>810773 >>810777
>>810755
>was python a mistake?
It has its own uses. You wouldn't want to write anything dealing with large amounts of strings, e.g. a web application or CGI, in C/C++. It's useful when there's no performance requirement, and it's good for quick prototyping. But it's pretty slow on my Android phone, to the point where the environment takes 0.2s to boot.
▶ No.810763>>810764 >>810881
Updates
author  language  time, ms  memory, bytes  ranking points (out of 35)
809731  c++             12           8760  34.972
810406  c                1          32768  34.971
810496  c               17            568  34.970
810038  c++              2         706153  34.407
810201  js             179         315392  34.424
810410  ruby           221         794624  33.950
810485  py3            196        5017600  30.470
809276  py3            195        5029888  30.461
810221  py2            340        5091500  30.156
810317  perl            93        5940749  29.880
809861  go             933        9961472  25.055
▶ No.810764
>>810763
>go is the worst
LOL
▶ No.810765>>810767
i'm sure rust would be in last place if it ever managed to run.
afaik the solution posted in this thread crashed or something :^)
▶ No.810767
>>810765
rustfag rage quit
▶ No.810773>>810796
>>810762
this is the biggest performance impact on the numba code now. i simplified and compressed the code a bit but it's not noteworthy enough to post.
def main():
    t_start=time.time()
    total_time=time.time()
    parser=argparse.ArgumentParser()
    parser.add_argument('filename',type=str)
    args=parser.parse_args()
    try:
        sequences=open(args.filename,'r').readlines()
    except IOError:
        print 'None'
        quit()
    p=Pool(1)
    results=p.map(worker,sequences)
    for result in results:
        print result
    print 'total time:'+str(time.time()-t_start)
time python2.7 challenge_new.py testdata
...
total time:0.0445680618286
real 0m0.406s
user 0m0.348s
sys 0m0.296s
this 350ms difference has to be python's initialization time.
▶ No.810777>>810779
>>810762
>It has its own uses. You wouldn't want to write anything dealing with large amounts of strings, i.e. a web application/CGI in C/C++.
Why not?
▶ No.810779
>>810777
this challenge is dealing with large amounts of strings: 40 strings, each 100k+ characters long
▶ No.810796>>810813
>>810773
it's the imports
def main():
    print "python is shit"

if __name__=="__main__":
    main()
time python2.7 shit.py
real 0m0.024s
user 0m0.020s
sys 0m0.004s
add
import argparse
import numba
import numpy as np
from multiprocessing import Pool
real 0m0.289s
user 0m0.232s
sys 0m0.196s
▶ No.810797>>810803 >>810932
just learn c or c++ already python anon
▶ No.810798>>810846
>>810515
>Now we could write an if / else if / else statement where length 1 is handled in the else so we don't have to do two comparisons for it, but the problem with this is the non-jumping portion of the resulting assembly will be the fastest due to how modern x86 processors work.
Doesn't branch prediction deal with this?
▶ No.810799>>810856 >>810913
Now it's my turn!!!!
I am a pedophile, does that give me special points?
I did not look at any of the solutions in this thread, written from scratch.
If I win, I want a little girl as a prize.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

/* map ASCII digit codes straight to values; only indices 48 ('0') to 57 ('9') matter */
uint8_t lookup [58] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
uint8_t lookup2 [58] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 };
uint16_t counts[101];

int main(int argc, char *argv[])
{
    FILE* fp = fopen(argv[1], "r");
    int c;
    int position = 0;
    char buffer[3];
    uint8_t cur_num = 0;
    uint16_t length = 0;
    uint16_t target_length;
    memset(counts, 0, 101*sizeof(uint16_t));
    while(1)
    {
        c = fgetc(fp);
        if((c == ',') || (c == '\n') || (c == EOF))
        {
            if(position == 2)
                cur_num = lookup2[buffer[0]] + lookup[buffer[1]];
            else if(position == 1)
                cur_num = lookup[buffer[0]];
            else if(position == 3)
                cur_num = 100; /* the only three-digit value in range is 100 */
            if(position > 0) /* guard: a trailing newline before EOF must not count as a number */
            {
                counts[cur_num]++;
                length++;
            }
            position = 0;
        }
        else
        {
            buffer[position] = (char)c;
            position++;
        }
        if(((c == '\n') || (c == EOF)) && length > 0)
        {
            target_length = length/2;
            int i;
            for(i=0; i<=100; i++)
            {
                if(counts[i] > target_length)
                {
                    printf("%d\n", i);
                    break;
                }
                if(i==100)
                    printf("None\n");
            }
            memset(counts, 0, 101*sizeof(uint16_t));
            length = 0;
        }
        if(c == EOF)
            break;
    }
    fclose(fp);
    return 0;
}
The lookup and lookup2 tables could be removed and calculation used instead; that would save a small number of bytes of memory.
Those shit tables were meant to be an optimization, but maybe they slow things down rather than speed them up? Adding and multiplying by 10 should be cheap, but is reading from a table cheap too?
▶ No.810803>>810805
>>810797
that's the only solution, but i didn't realize how slow those imports were.
if you're going to do a website in django, with all the shit it imports plus whatever else you're importing, it doesn't matter how fast your python code is; the response time isn't going to be any faster than those imports, which could be 300ms or worse by themselves.
▶ No.810805>>810849
>>810803
i've done django before on some things, and as far as I know it's a fresh launch of python every single time, so those imports are going to run and add lag on every single request.
▶ No.810809
>>810755
It's much easier to write working things in Python than in C/C++. There's a tradeoff between running time and development time, and Python is often the most sensible choice. Especially if you're the only user, or if the bottleneck is going to be outside your code either way (in the network, for example).
If you're writing a large application then it's possible that only 1% of your code is performance-sensitive. Optimizing that could be better than using a faster, harder language for the entire application. Writing only the performance-sensitive part in a faster language is possible too, but it really depends on the situation which approach is best. Writing fast code in a slow language can come in handy.
Not all of this is based on personal experience so take it with a grain of salt
▶ No.810813>>810822
>>810796
to be fair to python i benchmarked each individual import.
argparse:0.00708198547363
numba:0.185921907425
numpy:1.90734863281e-06
multiprocessing:6.91413879395e-06
argparse of all things took the longest out of the standard library, but none of them are as terrible as numba itself, which takes 185ms.
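a sketch of how you'd measure that, if anyone wants to reproduce it. note that order matters: the near-zero numpy figure is just import caching, since numba already dragged numpy in as a dependency, so the second import is a sys.modules lookup:
import time

# time each import once; anything already loaded as a dependency of an
# earlier module comes back from the sys.modules cache almost for free
for name in ('argparse', 'numba', 'numpy', 'multiprocessing'):
    t0 = time.time()
    __import__(name)
    print('%s: %fs' % (name, time.time() - t0))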
▶ No.810822>>810826
>>810813
>argparse of all things took the longest out of the standard library
It's because multiprocessing is lazy with importing.
The two standard library modules here are argparse and multiprocessing.
$ python3 -m timeit -s 'import multiprocessing, importlib' 'importlib.reload(multiprocessing)'
10000 loops, best of 3: 180 usec per loop
$ python3 -m timeit -s 'import argparse, importlib' 'importlib.reload(argparse)'
1000 loops, best of 3: 940 usec per loop
argparse is a simple module. It's just a single file, argparse.py, 89K and almost entirely function and class definitions. When you import argparse that file gets evaluated. Simple enough.
multiprocessing is a fuller-featured package. It's a directory with an __init__.py and a number of other modules, including pool.py. __init__.py pulls its contents from the context module (context.py). It takes them from a BaseContext object, which has a lot of method definitions, one of which is this:
def Pool(self, processes=None, initializer=None, initargs=(),
         maxtasksperchild=None):
    '''Returns a process pool object'''
    from .pool import Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
                context=self.get_context())
When you import Pool from multiprocessing, only the scaffolding submodules get evaluated; pool.py itself doesn't get evaluated until you create a Pool object for the first time (every import after that hits the cache). Notice what happens if we force the import of pool.py:
$ python3 -m timeit -s 'from multiprocessing import pool; import importlib' 'importlib.reload(pool)'
1000 loops, best of 3: 412 usec per loop
Or re-importing the scaffolding as well:
$ python3 -m timeit -s 'from multiprocessing import pool; import multiprocessing, importlib' 'importlib.reload(multiprocessing); importlib.reload(pool)'
1000 loops, best of 3: 612 usec per loop
Importing Pool from multiprocessing is still faster than importing argparse (as you'd expect, because pool.py is just 26K), but it's not that much faster. multiprocessing does well in your benchmark because it only imports what's necessary and delays it until you use it.
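you can steal the same trick in your own scripts: defer the expensive import into the function that needs it, so the cheap paths (--help, bad arguments) never pay for it. a minimal sketch, with numpy standing in for the expensive module:
import sys

def crunch(numbers):
    # deferred import: paid on the first call only, then it's a
    # cheap sys.modules cache hit on every call after that
    import numpy as np
    return int(np.sum(np.array(numbers, dtype=np.uint16)))

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('usage: crunch.py 1,2,3')  # this path never touches numpy
        sys.exit(1)
    print(crunch([int(x) for x in sys.argv[1].split(',')]))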
▶ No.810826
>>810822
thanks for the clarification, i was wondering what the fuck was going on there with argparse taking way longer than multiprocessing.
▶ No.810846
>>810798
A linear path is usually faster regardless of prediction. It really depends on the processor, as this changes dramatically every time Intel touches anything, and there are a lot of moving parts. Example of dramatic change: unaligned access is essentially free on i7.
▶ No.810849
>>810805
Python websites usually run as persistent webapps because of how absurdly slow startup is. With django, you'd normally host it as WSGI in an appserver like uwsgi talking to nginx. Then the interpreter only loads once, and is recycled every X requests to fix Python bloating up like a dead fish over time. While you could run it as CGI, it would be unusable for anything more than debugging.
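for reference, the WSGI contract itself is tiny; this is a complete app the appserver keeps loaded between requests (the file and function names are the conventional ones, but arbitrary):
# wsgi.py - module-level imports run once, when the appserver loads
# this file, not on every request like CGI
def application(environ, start_response):
    body = b'hello from a persistent worker\n'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]
something like "uwsgi --http :8080 --wsgi-file wsgi.py" should serve it (check the uwsgi docs), and the interpreter stays up between requests.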
▶ No.810856>>810894
>>810799
This uses the exact same amount of memory as that earlier solution by another anon. I guess the memory detection script on the pajeet site broke.
▶ No.810881>>810884 >>810892
>>810763
So basically, everything other than C and C++ is a meme language.
▶ No.810884
▶ No.810885>>810891
Where is Java solution???
▶ No.810891>>810893
>>810885
It's still starting up.
▶ No.810892
>>810881
at maximum autism this is the fastest /tech/ can make an interpreted language go on a simple problem.
▶ No.810893
>>810891
my sides were not prepared for that
▶ No.810894>>810895
>real 0m0.406s
>user 0m0.348s
>sys 0m0.296s
How do you people get that? What should I click to get these numbers?
>>810856
>This uses the exact same amount of memory as that earlier solution by another anon. I guess the memory detect script on the pajeet site broke.
so you uploaded and tested it?
show results and post an updated result list
▶ No.810895>>810898 >>810913
>>810894
"time", in the shell. Alternatively, /usr/bin/time is a different tool that shows some useful stuff in --verbose, but has shit resolution. perf is also useful.
▶ No.810898>>810899 >>810900
>>810895
time also gives different output if you have nice in front of it, i discovered.
time python2.7 challenge_new.py testdata
real 0m0.412s
user 0m0.336s
sys 0m0.176s
nice -20 time python2.7 challenge_new.py testdata
total time:0.0473449230194
0.36user 0.29system 0:00.44elapsed 149%CPU (0avgtext+0avgdata 74708maxresident)k
0inputs+32outputs (0major+21093minor)pagefaults 0swaps
i don't know how or why that is; time has a format option, but I don't see any reference to it in nice
▶ No.810899
>>810898
ignore "total time" that's coming from python i should have deleted that line
▶ No.810900>>810907 >>810916
>>810898
>time also gives different output if you have nice infront of it i discovered.
That's because the first command uses the shell built-in, and the second command uses /usr/bin/time by resolving $PATH.
nice can't access shell built-ins because it's not part of your shell. Try running "env time". It'll give the same kind of output as "nice -20 time".
Running "time nice" instead of "nice time" should let you use the built-in.
▶ No.810904>>810908
how can you see the memory that a process used?
▶ No.810907
>>810900
Can also do something like nice bash -c "time ..."
▶ No.810908>>810909
>>810904
That's actually a difficult question due to the definition of "memory that a process used". I'd recommend /usr/bin/time --verbose's resident set size, but that's still a pretty squishy number.
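if you'd rather measure from inside the script, the standard library can at least report peak RSS, with the usual squishiness (and a unit gotcha: kilobytes on Linux, bytes on macOS):
import resource

# peak resident set size of this process so far
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print('peak RSS: %d kB (on Linux)' % peak)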
▶ No.810909>>810911
>>810908
can you just do
nice time --verbose ./a.out
it's easier to type than the /usr/bin thing
▶ No.810911>>810926
>>810909
You can, but use "env" instead of "nice". It's the usual tool used for that purpose and it doesn't have side effects.
▶ No.810913>>810925
>>810895
>"time", in the shell. Alternatively, /usr/bin/time is a different tool that shows some useful stuff in --verbose, but has shit resolution. perf is also useful.
what do you mean by shell? windows explorer or cmd?
when I clicked "time" in cmd it shows time and asks me to set new time
can someone test my code >>810799 to show real user sys time?
▶ No.810916>>810920
>>810900
wait, so time run straight from bash is bash's own version of time, not /usr/bin/time or whatever is in my path? that's some bullshit.
▶ No.810919
man bash
RESERVED WORDS
Reserved words are words that have a special meaning to the shell. The following words are recognized as reserved when unquoted and either the
first word of a simple command (see SHELL GRAMMAR below) or the third word of a case or for command:
! case coproc do done elif else esac fi for function if in select then until while { } time [[ ]]
one of these should change the name
▶ No.810920
▶ No.810922>>810927 >>811415
>>810262
How do you test this shit? The level of autism to just get lisp implementations to compile something is ridiculous. I used sbcl (this is the 'fast' one, right?), did (load "whatever") then (main "test.txt") (from the python test case generator) and it took 84 seconds. Is that expected or is there some sort of "don't go horribly fucking slow" flag I missed? I also tried compiling it to an executable like,
>(sb-ext:save-lisp-and-die "lispshit" :toplevel #'main :executable t)
but the result wouldn't take a command line argument as far as I could tell.
Lisp was a mistake.
▶ No.810925
>>810913
you're baiting aren't you?
real 0m0.017s
user 0m0.000s
sys 0m0.000s
▶ No.810926>>810963
>>810911
I'd not recommend using env for anything. That program was a mistake.
▶ No.810927>>810929 >>811416
>>810922
Also, the lisp version, in addition to taking 84 seconds, required 99 megs of memory.
▶ No.810929>>810931
>>810927
>>810262
>>810239
i want to see this on the list
▶ No.810931
>>810929
I don't know if the site OP uses supports it. I think they were arguing above about that. But I wanted to see some numbers for it so I did it myself.
▶ No.810932>>810938 >>811113
>>810797
lol >>810496 doesn't work. So congrats on your fucking testing abilities pajeet.
▶ No.810963>>810978
>>810926
I think env is a horrible hack, but what would you use instead?
"#!/usr/bin/env interpreter" is awful, but so is anything that doesn't actually work.
▶ No.810978>>810991
>>810963
Give the standardized path to the program. Part of LSB autism was to remove the need for env hacks.
▶ No.810991>>810994
>>810978
That's nice in theory, but LSB only covers Linux, and most distros have given up the pretense of following it.
It's a gain in compatibility, a loss in purity, but no loss in functionality or functional correctness. It's better than not doing it.
▶ No.810994
>>810991
I've not had to use env in probably 15 years. I'd re-evaluate what you're doing if you find it necessary.
▶ No.811008>>811023 >>811028
>>808650 (OP)
>>808655
>>808660
>"Coding challenges for the world's best developers."
>Posting baby-tier problems that you should be able to solve in your first week of programming.
>Has to look for other people to solve it for him.
>Bonus for rust.
Do your own homework, faggot.
▶ No.811023
>>811008
>no fun allowed
>lets all larp about programming but never do any actual programming
▶ No.811028
>>811008
Even a babby-tier problem can be autistically optimized as a challenge. You afraid to compete, anon?
▶ No.811037
i've been thinking about writing this in bash but i don't think i have the motivation for that
▶ No.811055>>811057 >>811064
>>810485
I made this almost three times as fast by keeping the numbers as strings instead of converting them:
from sys import argv
from collections import Counter

for line in open(argv[1]):
    nums = line.strip().split(',')
    [(major, count)] = Counter(nums).most_common(1)
    print(major if count > len(nums) / 2 else None)
I also checked if reading the file as plain bytes instead of unicode strings made it faster, but that made it a bit slower for some reason.
▶ No.811057>>811061 >>811062
>>811055
Python3 likely converts it to unicode when passed to Counter. How does that run on python2?
▶ No.811061
>>811057
just checked, runs fine on 2.7 with zero modifications.
▶ No.811062>>811071 >>811072
>>811057
Counter is a dict subclass. It takes anything hashable, not just strings, and Python is strongly typed, so no unicode conversion. The implementation is simple enough that I'm experimenting with just doing the counting manually.
I tried Python 2, and to my surprise, it was more than two times as slow. Using unicode in Python 2 doesn't help.
▶ No.811064
>>811055
Counter probably doesn't care; it'll count any hashable type in any iterable
▶ No.811071>>811087
>>811062
>The implementation is simple enough that I'm experimenting with just doing the counting manually.
I just discovered that I was looking at a pure Python fallback, and CPython runs a C implementation of the critical function, so this is not going to happen.
All the heavy lifting is done through things implemented in C at this point so I don't think it'll get much faster without witchcraft.
▶ No.811072>>811086
>>811062
>I tried Python 2, and to my surprise, it was more than two times as slow.
Weird - python2 is usually faster. Maybe you've discovered the one case that justifies the existence of python3.
▶ No.811086
>>811072
I think it's because Python 2's Counter doesn't have the bottleneck written in C. Specifically, it's these two lines:
for elem in iterable:
    self[elem] = self_get(elem, 0) + 1
In Python 3 they're moved into a separate function that's replaced by a function of the same name imported from _collections if possible.
Python 3 is a much better language than Python 2, but performance improvements are usually because of things that could be backported to Python 2.
▶ No.811087>>811091
>>811071
>witchcraft
>>810743
this will do it in 40ms if i can somehow reduce numba's 200ms import time
▶ No.811089
>>809485
That guy isn't me. The 99 thing was a joke about the interval notation. Only pretending to be retarded, etc.
▶ No.811091
>>811087
you can separate the numba shit and compile it to a module that can be imported without numba installed. i've thought about compiling it separately, compressing it to a string, including it in the source, and then having it decompress the string into a python tempfile (which is hopefully in ram) and import the module from there, but i'd bet it's going to take longer than 200ms and not be worth the effort
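the first half of that is actually supported out of the box: numba's pycc does ahead-of-time compilation to a regular extension module that imports without numba installed. a sketch of the idea, assuming your numba version's AOT compiler accepts the signature (the module and function names here are mine):
# build_major.py - run once; produces a normal extension module
# (major_aot.so) that imports with none of numba's 200ms overhead
import numpy as np
from numba.pycc import CC

cc = CC('major_aot')

@cc.export('count_numbers', 'uint16[:](uint8[:])')
def count_numbers(seq):
    counts = np.zeros(102, dtype=np.uint16)
    for x in seq:
        counts[x] += 1
    return counts

if __name__ == '__main__':
    cc.compile()
after running it once, "import major_aot" costs what any compiled extension costs.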
▶ No.811099>>811100 >>811103
>>809866
>>809861
>933
wtf?
real 0m0.001s
user 0m0.000s
sys 0m0.002s
Where does 933 come from?
▶ No.811100>>811101
>>811099
What happens if you use go run?
▶ No.811101
>>811100
real 0m0.132s
user 0m0.140s
sys 0m0.025s
▶ No.811103>>811108
>>811099
How do you build/run go so it goes fast? I could try running it here.
▶ No.811108>>811109
>>811103
Not sure what you mean. I have go version go1.9.1 linux/amd64, and I just time the go build / go run.
▶ No.811109>>811111
>>811108
Running it here, it doesn't produce the same output. Seems to give up after 5 tests.
root@sid:~/src# ./809861 test.txt
None
0
0
None
None
This is the test data I'm using:
https://mega.nz/#!VPoAWRwQ!MNyWOflyP3HbfCrcCn0WRp008dQJPyixKgBAUphC240
▶ No.811111>>811112
>>811109
$ time ./challenge test.txt
None
0
0
None
None
real 0m0.014s
user 0m0.014s
sys 0m0.001s
▶ No.811112>>811115
>>811111 (checked)
Yeah, but there's 40 lines in that file, not 5.
▶ No.811113
>>810932
I thought all the numbers had to be positive. Whoops.
Here it is now. I hope it works this time...
#include <stdio.h>
#include <stdlib.h>

#define MAXLEN 30000

int major_element(int *, int);

int main(int argc, char *argv[]){
    FILE *fp;
    int c;                  /* int, not char: fgetc() must be compared to EOF */
    char snum[6];
    char *p = snum;
    int numbers[MAXLEN];
    int len = 0;
    int major;

    if(argc < 2){
        printf("Need an argument\n");
        return 0;
    }
    fp = fopen(argv[1], "r");
    if(fp == NULL){
        printf("couldn't open file\n");
        return 0;
    }
    /* note: assumes every sequence ends with '\n'; a final line without
       a trailing newline would be dropped */
    while((c = fgetc(fp)) != EOF){
        if('0' <= c && c <= '9'){
            *p++ = (char)c;
        }
        else if(c == ','){
            *p = '\0';
            numbers[len++] = atoi(snum);
            p = snum;
        }
        else if(c == '\n'){
            *p = '\0';
            p = snum;
            numbers[len++] = atoi(snum);   /* count the last number on the line too */
            major = major_element(numbers, len);
            if(major == -1){
                printf("None\n");
            }
            else{
                printf("%d\n", major);
            }
            len = 0;
        }
        else{
            return 0;
        }
    }
    fclose(fp);
    return 0;
}

/* Boyer-Moore majority vote: one pass to pick a candidate,
   one pass to verify it really is a strict majority */
int major_element(int *list, int len){
    int count = 0;
    int major = 0;
    int occ = 0;
    for(int i = 0; i < len; i++){
        if(list[i] == major){
            count++;
        }
        else if(count > 0){
            count--;
        }
        else{
            count = 1;
            major = list[i];
        }
    }
    for(int i = 0; i < len; i++){
        if(list[i] == major){
            occ++;
        }
    }
    if(occ > len/2){
        return major;
    }
    return -1;
}
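that counting trick has a name, the Boyer-Moore majority vote, and it's what lets the function get away with O(1) extra memory. for the python anons, the same idea as a sketch:
def major_element(nums):
    # pass 1: pair off unequal elements; a strict majority survives
    count, candidate = 0, None
    for n in nums:
        if count == 0:
            candidate = n
        count += 1 if n == candidate else -1
    # pass 2: verify the survivor really is a majority
    return candidate if nums.count(candidate) > len(nums) / 2 else None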
▶ No.811115>>811117
>>811112
Those are the numbers from your file. It is still 14ms, nowhere near 933. Where does that number come from?
Sorry, I don't understand. pls no bully
▶ No.811117>>811120 >>811125 >>811145
>>811115
What I mean is, the test.txt file has 40 lines, 40 tests. The output of your program only shows 5 tests. It's skipping a huge amount of the test data for some reason.
I cut your program down to something that should just print line by line (I think, as I don't know go):
package main

import (
    "bufio"
    "fmt"
    "os"
)

func process(filename string) {
    f, err := os.Open(filename)
    if err != nil {
        panic(err)
    }
    defer f.Close()
    s := bufio.NewScanner(f)
    for s.Scan() {
        fmt.Println("None")
    }
}

func main() {
    if len(os.Args) != 2 {
        fmt.Printf("usage: %s [input]\n", os.Args[0])
        return
    }
    process(os.Args[1])
}
But it still only prints 5 lines. Can you see what's up? Is "for s.Scan()" doing the right thing?
▶ No.811119
imo this might end up being a bug in go.
▶ No.811120>>811134
>>811117
It is not my program. I didn't check the code, I just copied the source code and tried the test. I don't program in Go. I will try to fix it.
▶ No.811125>>811126 >>811129 >>811135
>>811117
Ahaha, yeah, it's a bug/"feature" in go. Great job, faggots at Google.
If you add,
if err := s.Err(); err != nil {
    log.Fatal(err)
}
The silent error the scanner is generating (what the fuck kind of systems language silently truncates a file?) will get logged:
2017/10/26 17:01:01 bufio.Scanner: token too long
exit status 1
The textbook 'iterate over lines of a file' go example will fail on long lines. These aren't even that long. What a fucking disaster. Rather than fix it, they documented it,
Scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the buffer. When a scan stops, the reader may have advanced arbitrarily far past the last token. Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.
Why would you leave garbage like that in a language? It's just a landmine for novice users to create security holes with.
▶ No.811126>>811222
>>811125
Isn't failing mysteriously on long lines an ancient Unix tradition?
▶ No.811134
>>811120
don't even waste your time, fuck Go
▶ No.811135>>811138 >>811140
>>811129
>>811125
> It's just a landmine for novice users to create security holes with.
It is well documented and it works as advertised. What is the problem?
▶ No.811138>>811141
>>811135
https://golang.org/pkg/bufio/#Scanner
this reads as being literally no different from a readlines function, just with an arbitrary line-length limit. is there some kind of performance benefit?
▶ No.811140>>811180
>>811135
gets() was also well documented and worked as advertised.
Look at what happened here: all over twitter and stackoverflow people recommend using Scanner to iterate over lines even though it fucks up (silently!) as soon as it's fed a long line, for some unknown definition of long. As a result, anon wound up learning the wrong way to do things and produced broken code. Something that broken shouldn't be part of a language, because exactly what happened here will keep happening.
▶ No.811141
>>811138
I think the main reason is that using Reader is a pain in the ass compared to Scanner. I was just trying it, and even converting a []byte to a string is a pain in the ass.
▶ No.811142>>811147
you people just need to learn what the correct level of abstraction is
▶ No.811145>>811146
>>811117
New, hopefully correct, time now.
$ time ./challenge test.txt
None
0
0
None
None
None
None
None
None
0
None
0
None
None
None
0
0
None
None
0
0
None
None
0
0
None
None
None
None
0
0
None
None
None
0
0
None
None
None
None
real 0m0.041s
user 0m0.041s
sys 0m0.003s
▶ No.811146>>811149
>>811145
Just increased the buffer size in that anon's code.
s := bufio.NewScanner(f)
const maxCapacity = 512*1024
buf := make([]byte, maxCapacity)
s.Buffer(buf, maxCapacity)
▶ No.811147
>>811142
Jesus, what a terrible community. I guess they all came from webdev, so it figures.
▶ No.811149>>811151
>>811146
How hard is it to change it to accept any line length?
▶ No.811151
>>811149
maybe use the readlines thing, get the max lengths of the lines, and then use scanner on it again so you can get the correct level of abstraction
▶ No.811154>>811163
Almost everything that turns up when searching for how to iterate over the lines of a file in go in a safe/general way is wrong. And as a bonus, everyone leaves out error handling (why is this even optional in 3 CYE?). What a trainwreck.
▶ No.811163
>>811154
>In each loop we print the line returned by the scan.
>we
these faggots resort to shilling their shitty language in stackoverflow
▶ No.811180>>811192
>>811140
>will fuck up (silently!)
It doesn't fail silently. You have to check for errors (Err). This is working as intended.
>b-but I trusted a Pajeet's code on stackoverflow intead of reading docs and it didn't have error checking
If you are too retarded to error check yourself then use Java.
▶ No.811181
This code is GPL do not steal. My code. Warning my code must give code to use. By reading this you have agreed to give me your code. Thanks you kindly sirs.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if(argc == 1) goto retard;
    FILE *fp = fopen(argv[1], "r");
    if(fp == NULL) goto retard;
    int buf_ind = 0, len = 0;
    int counts[101] = {0};
    int c;                       /* int so the EOF comparison actually works */
    char buf[5] = {0}, *t;
    while((c = fgetc(fp)) != EOF){
        buf[buf_ind++] = (char)c;
        if(c == ',' || c == '\n'){
            int val = strtol(buf, &t, 10);
            if(val == 0 && t == buf) goto retard;
            buf_ind = 0;
            counts[val]++;
            len++;
            if(c == '\n'){
                len >>= 1;       /* major element needs strictly more than len/2 hits */
                int res = -1;
                for(int i = 0; i < 101; i++){
                    if(counts[i] > len) res = i;
                    counts[i] = 0;
                }
                len = 0;
                (res < 0) ? printf("None\n") : printf("%i\n", res);
            }
        }
    }
    return 0;
retard:
    printf("Retard\n");
    return -1;
}
Free software. Give me code. Thank you sirs.
▶ No.811192
>>811180
There is no excuse for optional error checking in a modern language.
▶ No.811220>>811227
Code I would actually accept in a sysadmin context (performance not that important; reviewability very important; lots of small scripts so something of this scale would actually show up):
>>809276
>>809324 (though a binary for any small task is an inconvenience; you have to find the source to be sure of what it does; you can't do a quick fix or add some troubleshooting code)
>>809352
>>809390
>>809703
>>809731
>>809861
>>810067
>>810132
>>810201 (JS solution)
... this is depressing. I'll leave it at that. It's basically a list of every entry except the Python entries anyway, which are just fucking revolting, every single one of them. Python's terrible but there's no reason for it to be this bad. What you do with a shitty scripting language is, you solve the problem in a natural and direct way, and if it's a BS website-tester problem with inputs designed to stress a hash-table or whatever, then you say "job done" and move on to more real problems. If your natural and direct way is too slow for a real-world problem, then you try and solve it with architecture (run less often / use caching / change requirements) or you use a more serious language.
▶ No.811222
>>811126
the mantra is "0, 1, or Many"
and the mantra is not followed by anything important. Servers have finite RAM, finite CPUs, finite storage. Software running on servers therefore has tunable finite limits. Similarly, when you put up a "my music collection" bookshelf you don't let someone base64-encode their 2TB harddrive and store the result in the name of a song, using your website as their personal backup service.
so yeah fuck infinite lines.
▶ No.811224
... well the thread's no longer bumping so I guess no need to look for the next terrible Python solution.
▶ No.811227>>811237
>>811220
>I pick stuff that's aesthetic
>picks a bunch of non-functioning shit
lmfao
▶ No.811237>>811243 >>811262
>>811227
according to this shitty website, with its terrible spec, unexaminable inputs, and no error messages or ability to troubleshoot. In the real world, problems with these would just be fixed, very easily.
It doesn't much apply here, but in general, optimization is specialization: if you have multiple platforms, or if the upstream changes, or if a bug causes some of the data to be (correctibly) wrong, then it's usually easier to adapt a general solution than an optimized one.
▶ No.811243>>811256
>>811237
I always do general solutions until I need optimized solutions, but people abuse that to justify absolute garbage that can never be converted into an optimized solution.
▶ No.811256
>>811243
there's no wisdom that can't be abused by a fool.
▶ No.811262>>811267
>>811237
Which one is your attempt?
▶ No.811267>>811284 >>811288
>>811262
obv. it's one of the failures.
this pajeet site is just as bad: http://www.spoj.com/submit/TEST/
but if you try to submit a solution you'll see how many more languages it offers: Ada, Dart, Forth, Common Lisp, Nim.
▶ No.811284
>>811267
"pajeet site" is not a joke
▶ No.811288>>811289
>>811267
is this challenge serious?
"
TEST - Life, the Universe, and Everything
#basic #tutorial #ad-hoc-1
Your program is to use the brute-force approach in order to find the Answer to Life, the Universe, and Everything. More precisely... rewrite small numbers from input to output. Stop processing input after reading in the number 42. All numbers at input are integers of one or two digits."
Example
Input:
1
2
88
42
99
Output:
1
2
88
....
#!/bin/python2.7
while(1==1):
    input=raw_input()
    if input=='88':
        print input
    else:
        break
▶ No.811289>>811291
>>811288
#!/bin/python2.7
import locale
if locale.getdefaultlocale()[0]=='hi_IN':
    print 'kill yourself'
    quit()
pajeet_detected=False
while(1==1):
    input=raw_input()
    if input!='88' and pajeet_detected!=True:
        pajeet_detected=True
        print input
    else:
        break
▶ No.811291>>811297
>>811289
>3 tries to get it right
it's 3 am suck it
#!/bin/python2.7
import locale
if locale.getdefaultlocale()[0]=='hi_IN':
    print 'kill yourself'
    quit()
pajeet_detected=False
while(1==1):
    input=raw_input()
    if input!='88' and pajeet_detected!=True:
        print input
    else:
        pajeet_detected=True
▶ No.811297
>>811291
improved
#!/bin/python2.7
from multiprocessing import Process
import locale

def worker():
    while(1==1):
        print 'kill yourself'

p=Process(target=worker)
p.start()
if '_IN' in locale.getdefaultlocale()[0]:
    p=Process(target=worker)
    p.start()
pajeet_detected=False
while(1==1):
    input=raw_input()
    if input!='88' and pajeet_detected!=True:
        print input
    else:
        pajeet_detected=True
▶ No.811323
▶ No.811369
>#!/bin/python2.7
fucking pajeet. #!/usr/bin/env python2.7
also stop using python 2 you retard
▶ No.811385
I confirmed the pajeet site uses valgrind to determine memory usage
▶ No.811415
>>810922
>I don't know how it works, so it must be bad
Literally you.
▶ No.811416>>811423
>>810927
Where is the file to test the algorithm?
I want to test and enhance my code, but I don't want to create an account on the shit website.
▶ No.811423
▶ No.811433>>811437 >>811447
i like how the first few posts are all people bitching about having to program.
code competitions should be a regular thing on /tech/, too bad only like 7 people on the entire board can program worth a damn
▶ No.811437
>>811433
>15 browser threads
oy vey i'm not going to do your homework
▶ No.811439
i just read this pajeet site more closely
https://www.codeeval.com/how_it_works/#ranking
so basically, their business model is: companies post "challenges" and everyone on the site does the work for free under the guise that one pajeet might get a job out of it. in the meantime, all the people who post "challenges" get free work with zero obligation to actually pay anything.
▶ No.811441>>811461
so the companies do pay
https://www.codeeval.com/pricing/
if you want to post a (((challenge))) then you have to pay, at which point 40k pajeets rush to complete it in the hopes of a job, and you get 40k different ways to solve your problem.
i was actually thinking about making an account, because why not, i need something to practice C on. this is some jewry though
▶ No.811447>>811450
>>811433 (checked)
Well how about you make the next thread? This one seems to be finished.
▶ No.811450
>>811447
this, someone make another one i need something to practice on with C
▶ No.811461
>>811441
all the code challenge sites are shit so who cares?
▶ No.812279
#include <stdio.h>
int main() {
    printf("19\n92\nNone\n");
    return 0;
}
:^)