Bimillennial coding challenge

Name
Email
Subject
Comment *
File	Select/drop/paste files here
* = required field	[▶ Show post options & limits] Confused? See the FAQ.

Flag
Oekaki	Show oekaki applet (replaces files and can be used instead)
Options	Do not bump (you can also write sage in the email field) Spoiler images (this replaces the thumbnails of your images with question marks)
Password	(For file and post deletion.)
Allowed file types:jpg, jpeg, gif, png, webm, mp4, pdf Max filesize is 16 MB. Max image dimensions are 15000 x 15000. You may upload 3 per post.

[–]

▶Bimillennial coding challenge Anonymous 10/22/17 (Sun) 12:44:58 No.808650>>808718 >>808788 >>809600 >>809861 >>811008 [Watch Thread][Show All Posts]

Last thread introduced a lot of bickering and anger, let's see if we can replicate that.

Rules

Commit your solution to the challenge in this thread. (Solutions written by women and solutions written in Rust receive a 10% bonus when being rated)

Challenge

The major element in a sequence with the length of L is the element which appears in a sequence more than L/2 times. The challenge is to find that element in a sequence.

Input sample:


92,19,19,76,19,21,19,85,19,19,19,94,19,19,22,67,83,19,19,54,59,1,19,19
92,11,30,92,1,11,92,38,92,92,43,92,92,51,92,36,97,92,92,92,43,22,84,92,92
4,79,89,98,48,42,39,79,55,70,21,39,98,16,96,2,10,24,14,47,0,50,95,20,95,48,50,12,42

Your program should accept as its first argument a path to a filename. Each line of the file contains a sequence of integers N separated by comma. E.g.

Output sample:


19
92
None

For each sequence print out the major element or print "None" in case there is no such element. E.g.

Constraints:

N is in range [0, 100]

L is in range [10000, 30000]

The number of test cases <= 40

▶Anonymous 10/22/17 (Sun) 12:48:40 No.808652>>808655

I think there should be a week waiting time before anyone is allowed to reply so OP has to do his homework himself first.

▶Anonymous 10/22/17 (Sun) 12:51:52 No.808653>>808655 >>808794

1. sort list

2. begin looping through items starting at beginning of list

3. if the current element's value is equal to the value at the current index + L/2, return that element

4. keep going until that value is found, or reached L/2

▶Anonymous 10/22/17 (Sun) 13:08:45 No.808655>>808660 >>811008

>>808652

Not my homework, nice try.

>>808653

No pesudocode allowed.

▶Anonymous 10/22/17 (Sun) 13:19:15 No.808660>>811008

>>808655

op copied the problem from https://www.codeeval.com/browse/132/

he wants us to come up with an original answer for him to put on his profile

▶Anonymous 10/22/17 (Sun) 16:43:37 No.808712>>808715

Done. It could be neater, and I've taken some liberties with the output so that you can't use it as homework, but it works.

Save as homework.perl and use it like so:


perl homework.perl "92,19,19,76,19,21,19,85,19,19,19,94,19,19,22,67,83,19,19,54,59,1,19,19"
>19

perl homework.perl "92,11,30,92,1,11,92,38,92,92,43,92,92,51,92,36,97,92,92,92,43,22,84,92,92"
>92

perl homework.perl "4,79,89,98,48,42,39,79,55,70,21,39,98,16,96,2,10,24,14,47,0,50,95,20,95,48,50,12,42"
>None

perl homework.perl
>Crashes because I'm too lazy to add an alternative.

https://hastebin.com/ribenivili.pl

▶Anonymous 10/22/17 (Sun) 17:00:31 No.808715>>808719 >>809087

>>808712

>perl

▶Anonymous 10/22/17 (Sun) 17:09:03 No.808718

>>808650 (OP)

>women and rust get a 10% bonus

Your 'challenge' gets a 110% penalty for this cancer, OP. Hang your head in shame.

▶Anonymous 10/22/17 (Sun) 17:10:00 No.808719

>>808715

OP gets the code he deserves in the language he deserves.

At least he would, but I don't know Rust.

▶Anonymous 10/22/17 (Sun) 17:10:42 No.808720

I'm a black Jewish trans M2F otherkin who identifies as lesbian. I felt unsafe coding in Rust which is a toxic masculine name. Coded in Mathematica because it's proprietary.


majorElement[lst_] :=
 SelectFirst[
   Tally[lst], #[[2]] > Length[lst]/2 &] /.
  {Missing[_] -> "None",
   {x_, c_} -> x}

▶Anonymous 10/22/17 (Sun) 17:11:56 No.808721

File (hide): 4e2e356c6a7134a⋯.jpeg (52 KB, 900x810, 10:9, 12649829843.jpeg) (h) (u)

lmao @people thinking anyone would come to /tech/ for homework

just post it on cuckoverflow where people routinely solve homework for some upboats

▶Anonymous 10/22/17 (Sun) 17:37:26 No.808728>>808731

Use a dictionary

O(n), can't do it faster I believe.

▶Anonymous 10/22/17 (Sun) 17:57:49 No.808731>>808732

>>808728

It can be done in O(1) time you dumb nigger.

▶Anonymous 10/22/17 (Sun) 17:58:35 No.808732>>808734

>>808731

How? **You dumb nigger*

▶Anonymous 10/22/17 (Sun) 18:07:15 No.808734>>808735

>>808732

O(k) is asymptotically equivalent to O(1) for any constant k you double nigger.

▶Anonymous 10/22/17 (Sun) 18:28:02 No.808735>>808749

>>808734

O(L) then. Happy faggot? Learn standard notation, n is the length of the sequence

▶Anonymous 10/22/17 (Sun) 19:04:54 No.808749>>808753

>>808735

Please give me a value of n where O(n) wouldn't be a constant.

▶Anonymous 10/22/17 (Sun) 19:11:35 No.808753>>808757 >>808761 >>808770

>>808749

n=a variable=the length of the sequence

You're insane. Now show me how to do it in constant time rather than linear, faggot

▶Anonymous 10/22/17 (Sun) 19:13:32 No.808757

>>808753

no you double faggot

▶Anonymous 10/22/17 (Sun) 19:19:06 No.808761

>>808753

Impossible. Only alternative to O(n) would be probabilistic (Monte Carlo) methods up to some desired confidence interval.

▶Anonymous 10/22/17 (Sun) 19:28:00 No.808770

>>808753

>n=a variable=the length of the sequence

No, nigger, that's L.

>show me how to do it in constant time

It's trivial. Do you even know what constant time means?

▶Anonymous 10/22/17 (Sun) 19:33:46 No.808771

jesus christ stop taking such obvious bait

▶Anonymous 10/22/17 (Sun) 20:20:16 No.808788>>808791 >>808795

>>808650 (OP)

>N is in range [0, 100]

this is retarded just create an array of 100 counters and record the largest one.

That's it, now do your fucking homework.

▶Anonymous 10/22/17 (Sun) 20:28:32 No.808791>>809453

>>808788

> 0 to 100

> create 100 counters

You don’t deserve those dubs.

▶Anonymous 10/22/17 (Sun) 20:34:04 No.808794

I can do it in O(\alpha n):

Get a hash table where putting shit in each bin is O(\alpha)

Put every item there.

I assume the hash table has a "majorest element" and "amount of times it appears" that gets updated when you put the element in each bin (which stores how many times the element appeared).

Then take the majorest element of the hash table.

My solution is "faster" than >>808653 because I don't do the nlog(n) bit. All you have to do is control \alpha and you're golden (most times \alpha won't go over a constant). Just have a good hash function that doesn't put everything in the same bin.

▶Anonymous 10/22/17 (Sun) 20:35:05 No.808795>>808801

>>808788

Right. That there is the hash function. Then my answer is O(L).

▶Anonymous 10/22/17 (Sun) 20:49:16 No.808801>>808802

>>808795

It's O(n) you dumb nigger.

▶Anonymous 10/22/17 (Sun) 20:53:29 No.808802

>>808801

>can't do it in constant time

>calls others dumb niggers

▶Anonymous 10/22/17 (Sun) 21:03:35 No.808805>>808808

function major(L) 
  local M = {}
  local max_n = nil
  local max_count = 0
  local length = 0
  for n in L do
    M[n] = (M[n] or 0) + 1
    length = length + 1
  end
  for n,count in pairs(M) do
    if count > max_count then
      max_n = n
      max_count = count
    end
  end
  if max_count > length/2 then
    return max_n
  end
end

for line in io.lines(arg[1]) do
  print(major(string.gmatch(line, "%d+")) or "None")
end

Usage:

$ cat data
92,19,19,76,19,21,19,85,19,19,19,94,19,19,22,67,83,19,19,54,59,1,19,19
92,11,30,92,1,11,92,38,92,92,43,92,92,51,92,36,97,92,92,92,43,22,84,92,92
4,79,89,98,48,42,39,79,55,70,21,39,98,16,96,2,10,24,14,47,0,50,95,20,95,48,50,12,42
$ luajit shit.lua data 
19
92
None

▶Anonymous 10/22/17 (Sun) 21:10:24 No.808808

File (hide): a77f83a087144a6⋯.png (321.29 KB, 870x583, 870:583, smells better than new del….png) (h) (u)

>>808805

>helping pajeet get hired for a job he isn't remotely qualified for

▶Anonymous 10/23/17 (Mon) 00:16:09 No.808872>>808894 >>809100 >>809238 >>809255 >>809266 >>809648

Gotta go fast, faggots. Also, I identify as I woman.


#include <unistd.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define MAX_L 30000
#define SEPARATOR_LENGTH 1
#define EOL_LENGTH 1
#define MAX_N_LENGTH 1
#define MAX_CASES 40
#define MAX_LINE_LENGTH ((SEPARATOR_LENGTH * (MAX_L - 1)) + (MAX_N_LENGTH * MAX_L) + EOL_LENGTH)
#define MAX_FILE_LENGTH MAX_LINE_LENGTH * MAX_CASES

char buffer[MAX_FILE_LENGTH + 1];
uint_fast64_t counts[100];
uint_fast64_t const tens[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};

inline uint_fast64_t parse_n(char * restrict * restrict c) {
  uint_fast64_t retval = 0;

  char *begin = *c;
  while(**c >= '0' && **c <= '9') {
    ++*c;
  }

  int len = *c - begin;

  switch(len) {
    case 3: retval = 100; break;
    case 2: retval = tens[begin[len - 2] - '0'];
    case 1: retval += begin[len - 1] - '0';
  }

  return retval;
}

int main(int argc, char *argv[]) {
  uint_fast64_t n;
  uint_fast16_t l;
  int i;

  int fd = open(argv[1], O_RDONLY);
  (void) read(fd, buffer, MAX_FILE_LENGTH);

  char *c = buffer;
  goto start;
  for(; *c; ++c) {
    (void) memset(counts, 0, sizeof(counts));
start:
    for(l = 1; ; ++l, ++c) {
      n = parse_n(&c);
      ++counts[n];
      if(*c == '\n') break;
    }
    uint_fast16_t half = l >> 1;
    for(i = 0; i < 100; ++i) {
      if(counts[i] > half) {
        printf("%d\n", i);
        break;
      }
    }
    if(i == 100) printf("None\n");
  }
}

It could go faster with pthreads but it's just extra bulk for what would be a pretty boring split of a buffer across threads.

▶Anonymous 10/23/17 (Mon) 01:00:20 No.808894>>808911

>>808872

You need 101 bins, faggot.

▶Anonymous 10/23/17 (Mon) 01:20:09 No.808911>>808928

>>808894

No, I identify as a woman, I can get away with it.

▶Anonymous 10/23/17 (Mon) 01:46:10 No.808928

>>808911

I am so sorry. Will educate myself to be aware of my own privilege. I will learn from this teaching experience.

▶Anonymous 10/23/17 (Mon) 08:03:12 No.809087>>809089

>>808715

Perl beautiful language

perl -nE 'chomp;@a=split/,/;%c;$c{$_}++for@a;say((grep{$c{$_}>=~~@a/2}@a)[0]||None)' cocks

I'm not gay

▶Anonymous 10/23/17 (Mon) 08:04:57 No.809089

>>809087

Post the source not the binary, dumbass.

▶Anonymous 10/23/17 (Mon) 08:22:47 No.809100>>809107

>>808872

[code]

> goto start;

> for(; *c; ++c) {

> (void) memset(counts, 0, sizeof(counts));

>start:

What is going on here?

▶Anonymous 10/23/17 (Mon) 08:34:44 No.809107>>809111

>>809100

I'm skipping the first initialization since it's guaranteed to start zeroed. Jumping midway into a loop on the first iteration is generally a cleaner way to deal with that.

▶Anonymous 10/23/17 (Mon) 08:39:10 No.809111

>>809107

Argh, I need to sleep, I swear that { wasn't there when I first looked.

▶Anonymous 10/23/17 (Mon) 10:30:59 No.809140>>809141

Bump. Do something for once, /tech/.

▶Anonymous 10/23/17 (Mon) 10:34:00 No.809141

>>809140

Even 4/g/ is better at programming than /tech/ lmao

▶Anonymous 10/23/17 (Mon) 14:42:25 No.809232>>809255

File (hide): 2f068844e9328d9⋯.jpeg (59.79 KB, 860x484, 215:121, serveimage.jpeg) (h) (u)


import argparse
from multiprocessing import Pool

def findMajors(sequence):
	sequence=sequence.split(',')
	count_list=[]
	for integer in sequence:
		count=1
		for search in sequence:
			if integer==search:
				count+=1
		count_list.append(count)
	major_index=[]
	index=0
	for count in count_list:
		if count>len(sequence)/2:
			if index not in major_index:
				major_index.append(index)
		index+=1
	major_integers=[]
	for major in major_index:
		if sequence[major] not in major_integers:
			major_integers.append(sequence[major])
	return major_integers	

def worker_process(sequence):
	major_integers=findMajors(sequence)
	if major_integers==['']:
		print 'None'
	else:
		for major in major_integers:
			print major

def main():
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()
	try:
		sequences=open(args.filename,'r').read().split('\n')
	except:
		print 'None'

	p=Pool(len(sequences))
	p.map(worker_process,sequences)

if __name__=="__main__":
	main()

▶Anonymous 10/23/17 (Mon) 14:57:30 No.809238>>809245

>>808872


g++ challenge.c -o a.out

challenge.c:21:46: error: expected ‘,’ or ‘...’ before ‘*’ token
 inline uint_fast64_t parse_n(char * restrict * restrict c) {
                                              ^
challenge.c: In function ‘uint_fast64_t parse_n(char*)’:
challenge.c:24:18: error: ‘c’ was not declared in this scope
   char *begin = *c;
                  ^
challenge.c: In function ‘int main(int, char**)’:
challenge.c:54:21: error: cannot convert ‘char**’ to ‘char*’ for argument ‘1’ to ‘uint_fast64_t parse_n(char*)’
       n = parse_n(&c);

▶Anonymous 10/23/17 (Mon) 15:05:06 No.809245>>809251

>>809238

>g++

kys

▶Anonymous 10/23/17 (Mon) 15:17:17 No.809251>>809252 >>809253

>>809245


gcc challenge.c
In function `main':
challenge.c:(.text+0x87): undefined reference to `parse_n'
collect2: error: ld returned 1 exit status

▶Anonymous 10/23/17 (Mon) 15:19:46 No.809252>>809253

>>809251

remove the 'inline' from the start of parse_n's definition.

▶Anonymous 10/23/17 (Mon) 15:20:12 No.809253>>809259

>>809251

Why are you even on this board if you don't know shit about technology?

>$ gcc anon.c -std=c99 -O2

>>809252

No retard.

▶Anonymous 10/23/17 (Mon) 15:21:23 No.809255>>809770

File (hide): 9d08b2524d1807f⋯.png (9.84 KB, 544x216, 68:27, scores.png) (h) (u)

>>808872

Amazing, doubt anyone can compete with this

>>809232

Broken and busted

▶Anonymous 10/23/17 (Mon) 15:31:41 No.809259

>>809253

>he doesn't use C89

hipster.

▶Anonymous 10/23/17 (Mon) 15:33:05 No.809260

[[$]] major_search.

$<[[containers]]::*.
$<[[iterators]]::while,break.
$<[[stdio]]::[File printf].
$<[[string]]::String.

$::main<%({argc: i32 argv: u8*[]} i32 ->
  argc!=2? $::printf{"Usage: %s [file]" argv[0]}.

  list_count := 0.
  new counts: $::HashMap(i32 i32).
  f := $::File::open argv[1].
  f::seek 0.
  str := $::String::new{f::read f::tell}.
  f::close!.

  $::while %(bool -> !str::empty!) %(->
    pos := str::find','.
    !pos? $::break!.

    num: i32 = str::substr{0, pos}::to_i32!.
    !counts::contains{num}?
      counts::set{num 1} :
      counts::set{num counts::get{num}+1}.
    str::shift pos+1.
    list_count++.
  ).

  counts::map %({key: i32 val: i32} ->
    val >= list_count/2? (
      $::printf{"%d\n" key}. $=0.
    ).
  ).

  $::printf"None\n".
  0.
).

▶Anonymous 10/23/17 (Mon) 15:42:12 No.809266

File (hide): dfba5a0704d7bfe⋯.gif (1.99 MB, 320x240, 4:3, dfba5a0704d7bfe4ca1d6bf9ec….gif) (h) (u)

>>808872

>that switch

▶Anonymous 10/23/17 (Mon) 16:30:37 No.809276>>809288 >>809370 >>810399 >>811220

File (hide): cee52dbf1654ab8⋯.png (118.48 KB, 622x436, 311:218, 2017-10-23--1508775904_192….png) (h) (u)


#!/usr/bin/env python3


import sys
from collections import Counter


def sequences(filename):
    with open(filename, "r") as f:
        for line in f:
            yield [int(i) for i in line.split(",")]


def major(sequence):
    if not sequence:
        return None

    element, count = Counter(sequence).most_common()[0]

    if count <= (len(sequence) / 2):
        return None

    return element


def main():
    for sequence in sequences(sys.argv[1]):
        print(major(sequence))


if __name__ == "__main__":
    main()

▶Anonymous 10/23/17 (Mon) 16:42:38 No.809279>>809285 >>809307

>retards actually doing this faggots homework

lmao

▶Anonymous 10/23/17 (Mon) 16:56:38 No.809285

>>809279

I encourage retards to get by through stealing homework, they'll inevitably fail their exams.

▶Anonymous 10/23/17 (Mon) 17:00:36 No.809288>>809377

>>809276

gib terminal configuration

▶Anonymous 10/23/17 (Mon) 17:30:07 No.809298>>809569 >>809578 >>809592


#!/bin/python2.7
import argparse
import sqlite3
from multiprocessing import Pool

def dowork(sequence):
	conn=sqlite3.connect(':memory:')
	cursor=conn.cursor()

	seq_list=sequence.split(',')
	target_length=(len(seq_list)/2)

	cursor.execute ('CREATE TABLE IF NOT EXISTS challenge (integers integer)')
	for integer in seq_list:
		cursor.execute('INSERT INTO challenge VALUES (?)',(int(integer),) )
		conn.commit()
	cursor.execute('SELECT integers, COUNT(*) FROM challenge GROUP BY integers HAVING COUNT(*) > (?)',(target_length,))
		
	fetch=cursor.fetchone()
	if fetch!=None:
		return fetch[0]
	else:
		return 'None'

	cursor.execute('DROP TABLE challenge')
	conn.commit()
	conn.close()
	
def main():
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()
	try:
		sequences=open(args.filename,'r').read().rstrip().split('\n')
	except:
		print 'None'

	p=Pool(8)
	majors=p.map(dowork,sequences)
	for major in majors:
		print major

if __name__=="__main__":
	main()

▶Anonymous 10/23/17 (Mon) 17:38:10 No.809307>>809313 >>809317 >>809319 >>809329 >>809705

File (hide): 46f17866bc1956c⋯.webm (10.61 MB, 896x504, 16:9, ex-rust-developer-intervi….webm) (h) (u) [play once] [loop]

>>809279

>tfw someone actually gets assigned this retarded exercises

Am I the only Europoor CS student here? We barely see code, here, we mostly focus on the logical/mathematical aspect of things.

▶Anonymous 10/23/17 (Mon) 17:45:35 No.809313

>>809307

Which is why you'll be able to use ANY language when you graduate.

▶Anonymous 10/23/17 (Mon) 17:58:44 No.809316

Wow, you're making me realize just how much I suck a programming OP.

I tried doing this in C and I'm struggling to parse the fucking input.

I guess this is what happens when you only do algorithms (which is what you should be learning in school).

Also here's a hint to the people in this thread.

If you're using a hash table or any kind of bucket you're probably doing it wrong.

▶Anonymous 10/23/17 (Mon) 17:58:55 No.809317

>>809307

Nope, "computer science" was considered a branch of mathematics when I went to school.

▶Anonymous 10/23/17 (Mon) 18:01:22 No.809319

>>809307

Europoor here, we did Java in the first year and Assembly, Haskell and Prolog in the second. Half of each year and the whole of year 3 got delegated to theory, and we picked our own programming languages to fuck around with. The problem is that we all picked different languages to use primarily, and there were a lot of arguments on Python2 vs 3, PHP vs Perl and Java vs C#.

Theory is great, but you have to get your hands dirty with code too.

▶Anonymous 10/23/17 (Mon) 18:13:32 No.809324>>809390 >>809397 >>809494 >>811220

use std::collections::HashMap;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let mut args = std::env::args_os().skip(1);
    assert_eq!(args.len(), 1, "exactly 1 arg you stupid nigger");

    let file = File::open(args.next().unwrap()).expect("i can't open this file you retarded fag");
    let reader = BufReader::new(file);
    let mut map = HashMap::new();

    for line in reader
        .lines()
        .map(|r| r.expect("your fucking shit isnt utf8")) {
        let mut count = 0;

        for i in line.split(',')
            .map(|s| s.parse::<u32>().expect("hitler dindu nuffin")) {
            *map.entry(i).or_insert(0) += 1;
            count += 1;
        }

        if let Some((k, _)) = map.drain()
            .filter(|&(_, v)| v >= count / 2)
            .max_by_key(|&(_, v)| v) {
            println!("{}", k);
        } else {
            println!("None");
        }
    }
}

▶Anonymous 10/23/17 (Mon) 18:33:26 No.809329

>>809307

>We barely see code, here, we mostly focus on the logical/mathematical aspect of things.

That's because Euro CS is dead. In the '90s, you used to be all code and would do class projects as an entire class led by the professor. It's why Euro CS grads were so good back then. Today, you have the standard post-modern Jewish CS education where you learn no CS and produce CS Grads.

▶Anonymous 10/23/17 (Mon) 19:32:37 No.809352>>809361 >>809370 >>811220

#include <stdio.h>
#include <stdlib.h>

#define L_MAX 30000

struct element {
	int value;
	int n;
};

struct element_list {
	struct element elements[L_MAX];
	size_t index;
	size_t max_index;
};

void
initElementList(struct element_list *elist)
{
	elist->index = 0;
	elist->max_index = 0;
}

int
findElementByValue(int value, struct element_list *elist)
{
	int i;

	for (i = 0; i < elist->max_index; i++) {
		if (elist->elements[i].value == value)
			return i;
	}

	return -1;
}

void
addValueToList(int value, struct element_list *elist) {
	int idx;

	if ((idx = findElementByValue(value, elist)) == -1) {
		elist->elements[elist->max_index].value = value;
		elist->elements[elist->max_index].n = 1;
		elist->max_index++;
	} else {
		elist->elements[idx].n++;
	}
}

int
findMajorElement(struct element_list *elist)
{
	int i;
	int max = 0;
	int max_index = 0;

	if (elist->max_index == 0) return -1;

	for (i = 0; i < elist->max_index; i++) {
		if (elist->elements[i].n > max) {
			max = elist->elements[i].n;
			max_index = i;
		}
	}

	if (max > (elist->max_index / 2)) {
		return elist->elements[max_index].value;
	}

	return -1;
}

int
main (int argc, char *argv[])
{
	FILE *fp;
	char c;
	char buf[3];
	size_t buf_idx = 0;
	struct element_list elist;
	int major;

	initElementList(&elist);

	if (argc == 1) {
		printf("No input file.\n");
	} else {
		while (--argc > 0) {
			if (!(fp = fopen(*++argv, "r"))) {
				printf("Can not open file.");
				exit(1);
			} else {
				while ((c = getc(fp)) != EOF) {
					if ((c >= '0') && (c <= '9')) {
						buf[buf_idx] = c;
						buf_idx++;
					} else if ((c == ',') || (c == '\n')) {
						if (buf_idx <= 2) buf[buf_idx] = '\0';
						buf_idx = 0;

						addValueToList(atoi(buf), &elist);

						if (c == '\n') {
							if ((major = findMajorElement(&elist)) != -1) {
								printf("%d\n", major);
							} else {
								printf("None\n");
							}

							initElementList(&elist);
						}
					} else if (c == ' ') {
						/* Ignore spaces. */
					} else {
						printf("Unexpected character %c.\n", c);
					}
				}

				fclose(fp);
			}
		}
	}

	return 0;
}

From 0 to pajeet how bad is my code?

▶Anonymous 10/23/17 (Mon) 20:08:25 No.809361

>>809352

>calling an array a list

>O(N) insertion to array when the problem requires an int n->int m map and n can only have 101 possible values

>mixed bracing

>function types on the previous line

>using a 3-char buf instead of replacing ',' and '\0' and reusing parts of the input string smh

>getc()'s return assigned to a char

>trying to printf("%c") the -1 you'll get on from getc() on EOF

0.2pajeet

I hate you

▶Anonymous 10/23/17 (Mon) 20:35:02 No.809370

File (hide): 8b5493b144727b0⋯.png (2.72 KB, 528x82, 264:41, snek lang.png) (h) (u)

>>809276

pic related

>>809352

Failed

▶Anonymous 10/23/17 (Mon) 20:47:56 No.809377

>>809288

Xresources


URxvt.perl-ext:
URxvt.perl-ext-common: selection-to-clipboard

#define termfont      xft:tewi:pixelsize=13:antialias=false

URxvt.font: termfont
URxvt.boldFont: termfont

URxvt.intensityStyles: false
URxvt.boldMode: false
URxvt.skipBuiltinGlyphs: true
URxvt.loginShell: true
URxvt*title: urxvt
URxvt*termName: rxvt-unicode
URxvt*urgentOnBell: true
URxvt.letterSpacing: 1

URxvt.borderless: true
URxvt.internalBorder: 20

URxvt.scrollBar: false
URxvt.scrollBar_right: false
URxvt.scrollstyle: plain
URxvt.jumpScroll: false

URxvt.scrollWithBuffer: true
URxvt.scrollTtyKeypress: true
URxvt.scrollTtyOutput: false

URxvt.buffered: true
URxvt.pointerBlank: true
URxvt.underlineURLs: true
URxvt.saveLines: 2048

URxvt.url-select.launcher: /opt/waterfox/waterfox-bin
URxvt.url-select.underline: true
URxvt.iso14755: false
URxvt.iso14755_52: false

Xft.antialias: true
Xft.autohint: false
Xft.dpi: 96
Xft.hinting: true
Xft.hitstyle: hintslight
Xft.lcdfilter: lcddefault
Xft.rgba: rgb

Xcursor.size: 15

URxvt.utf8: true
URxvt.locale: true

! X colors.
! Generated by 'wal'
emacs*foreground:   #BDBBD2
emacs*background:   #1D1C22
URxvt*foreground:   #BDBBD2
XTerm*foreground:   #BDBBD2
UXTerm*foreground:  #BDBBD2
URxvt*background:   [100]#1D1C22
XTerm*background:   #1D1C22
UXTerm*background:  #1D1C22
URxvt*cursorColor:  #BDBBD2
XTerm*cursorColor:  #BDBBD2
UXTerm*cursorColor: #BDBBD2
URxvt*borderColor:  [100]#1D1C22

! Colors 0-15.
*.color0: #1D1C22
*color0:  #1D1C22
*.color1: #588877
*color1:  #588877
*.color2: #606F8B
*color2:  #606F8B
*.color3: #D060AE
*color3:  #D060AE
*.color4: #19A494
*color4:  #19A494
*.color5: #609E9B
*color5:  #609E9B
*.color6: #43DBD4
*color6:  #43DBD4
*.color7: #BDBBD2
*color7:  #BDBBD2
*.color8: #77767a
*color8:  #77767a
*.color9: #588877
*color9:  #588877
*.color10: #606F8B
*color10:  #606F8B
*.color11: #D060AE
*color11:  #D060AE
*.color12: #19A494
*color12:  #19A494
*.color13: #609E9B
*color13:  #609E9B
*.color14: #43DBD4
*color14:  #43DBD4
*.color15: #BDBBD2
*color15:  #BDBBD2

! Black color that will not be affected by bold highlighting.
*.color66: #1D1C22
*color66:  #1D1C22

! Rofi colors.
rofi.color-window: #1D1C22, #1D1C22, #606F8B
rofi.color-normal: #1D1C22, #BDBBD2, #1D1C22, #606F8B, #1D1C22
rofi.color-active: #1D1C22, #BDBBD2, #1D1C22, #606F8B, #1D1C22
rofi.color-urgent: #1D1C22, #588877, #1D1C22, #588877, #BDBBD2

! Xclock colors.
XClock*foreground: #BDBBD2
XClock*background: #1D1C22
XClock*majorColor:  rgba:bd/bb/d2/ff
XClock*minorColor:  rgba:bd/bb/d2/ff
XClock*hourColor:   rgba:bd/bb/d2/ff
XClock*minuteColor: rgba:bd/bb/d2/ff
XClock*secondColor: rgba:bd/bb/d2/ff

! Set depth to make transparency work.
URxvt*depth: 32

font: https://github.com/Lucy/tewi-font

▶Anonymous 10/23/17 (Mon) 21:19:15 No.809390>>809400 >>809509 >>811220

>>809324

The same thing without the hashmap


use std::fs::File;
use std::io::{BufRead, BufReader};
use std::env::args_os;

fn main() {
    let filename = args_os().skip(1).next().expect(
        "exactly 1 arg you stupid nigger",
    );

    let reader = BufReader::new(File::open(filename).expect(
        "i can't open this file you retarded fag",
    ));

    for line in reader.lines().map(
        |r| r.expect("your fucking shit isnt utf8"),
    )
    {
        let mut buf: [u32; 101] = [0; 101];
        let mut cnt = 0;

        for n in line.split(',') {
            buf[n.parse::<usize>().expect("hitler dindu nuffin")] += 1;
            cnt += 1;
        }

        match buf.into_iter().position(|x| *x > cnt / 2) {
            Some(n) => println!("{}", n),
            _ => println!("None"),
        }
    }
}

▶Anonymous 10/23/17 (Mon) 21:45:17 No.809397

>>809324

Now we need the Ada version.

▶Anonymous 10/23/17 (Mon) 21:49:22 No.809400>>809404

>>809390

>expect("error message")

expect

the English word "expect"

used for that

<triggered

▶Anonymous 10/23/17 (Mon) 21:58:58 No.809404

File (hide): c76db1c59ab13eb⋯.jpg (384.3 KB, 1400x933, 1400:933, steve-profile.jpg) (h) (u)

>>809400

▶Anonymous 10/24/17 (Tue) 00:03:27 No.809453>>809475

>>808791

Alright, you win. 99 then.

▶Anonymous 10/24/17 (Tue) 02:01:52 No.809475>>809482

>>809453

101 my black friend.

▶Anonymous 10/24/17 (Tue) 02:13:16 No.809482>>809485

File (hide): a09eb9ff622c2d0⋯.jpeg (103.56 KB, 1000x563, 1000:563, wtf.jpeg) (h) (u)

>>809475

I bet you were the brightest kid in the short bus.

▶Anonymous 10/24/17 (Tue) 02:20:42 No.809485>>811089

>>809482

Thank me for bothering to correct you.

▶Anonymous 10/24/17 (Tue) 03:02:38 No.809494>>809537

File (hide): ed9dc7724ebfd33⋯.jpg (155.61 KB, 1536x1042, 768:521, ed9dc7724ebfd33cd5aef815cc….jpg) (h) (u)

>>809324

> let mut args = std::env::args_os().skip(1);

▶Anonymous 10/24/17 (Tue) 04:08:47 No.809509

>>809390

So how does it perform?

▶Anonymous 10/24/17 (Tue) 05:27:08 No.809537

>>809494

explain

▶Anonymous 10/24/17 (Tue) 06:53:52 No.809569>>809574

>>809298

i wonder why this isn't faster than it is. it's not much faster than my straight python solution, isn't the sqlite3 module for python C? All the work here should be done by sqlite3 but it's still slow.

▶Anonymous 10/24/17 (Tue) 06:59:06 No.809574

>>809569

Databases are ridiculously fucking slow, anon. And sqlite3 is slow for a database.

▶Anonymous 10/24/17 (Tue) 07:04:59 No.809578>>809582 >>809592

>>809298

There's up to 30,000 numbers in each test right? You're probably spending a lot of time inserting each row individually. Look into doing a bulk insert, or possibly grouping those inserts into a single transaction. Of course this is just conjecture, and you should profile to see where time is being spent.

▶Anonymous 10/24/17 (Tue) 07:10:13 No.809582>>809587 >>809591 >>809648 >>809687 >>809774 >>810378

>>809578

A script to create test data if anyone wants it.


#!/usr/bin/python
from random import random

for t in xrange(40):
    l = 10000 + int((30000 - 10000 + 1) * random())
    bias = 10 * random()
    result = []
    for i in xrange(l):
        n = int(random()**bias * (100 + 1))
        result.append(n)
    print ",".join([str(x) for x in result])

▶Anonymous 10/24/17 (Tue) 07:25:29 No.809587>>809591

>>809582

i was _just_ about to write one myself. thx

▶Anonymous 10/24/17 (Tue) 08:01:38 No.809591>>809593

>>809582

>>809587

same, already did too


import random

string=''
row_count=40
integer_count=50000

row_counter=0
while(row_counter<row_count):
	integer_counter=0
	while(integer_counter<integer_count):
		string+=str(random.randint(0,100))+','
		integer_counter+=1
	string=string[:-1]+'\n'
	row_counter+=1
string=string[:-1]

f=open('testdata','w')
f.write(string)
f.close()

▶Anonymous 10/24/17 (Tue) 08:04:36 No.809592>>809602 >>809666 >>809825 >>810094

>>809578

doing a batch insert with executemany instead of one by one improved performance by 500%.

testing on 40 rows with 50k integers each:

10-11s with

>>809298

2-2.4s with


#!/bin/python2.7
import argparse
import sqlite3
from multiprocessing import Pool

def dowork(sequence):
	conn=sqlite3.connect(':memory:')
	cursor=conn.cursor()

	seq_list=sequence.split(',')
	target_length=(len(seq_list)/2)

	seq_list_int=[]
	for integer_str in seq_list:
		seq_list_int.append( ( int(integer_str),) )

	cursor.execute ('CREATE TABLE challenge (integers integer)')
	cursor.executemany('INSERT INTO challenge VALUES (?)',seq_list_int)
	conn.commit()

	cursor.execute('''SELECT integers,max(rowid) FROM challenge 
		GROUP BY integers HAVING COUNT(1) > (?) LIMIT 1''',(target_length,))
		
	fetch=cursor.fetchone()
	if fetch!=None:
		return fetch[0]
	else:
		return 'None'

	cursor.execute('DROP TABLE challenge')
	conn.commit()
	conn.close()
	
def main():
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()
	try:
		sequences=open(args.filename,'r').read().rstrip().split('\n')
	except:
		print 'None'

	p=Pool(8)
	majors=p.map(dowork,sequences)
	for major in majors:
		print major

if __name__=="__main__":
	main()

▶Anonymous 10/24/17 (Tue) 08:10:28 No.809593

>>809591

>integer_count=50000

>L is in range [10000, 30000]

wat

▶Anonymous 10/24/17 (Tue) 08:25:59 No.809600>>809606 >>809627

>>808650 (OP)

IMPORTANT: OP PLEASE SPECIFY

are you saying that there is only one element which appears more than L/2 times?

▶Anonymous 10/24/17 (Tue) 08:36:45 No.809602>>809605 >>809611 >>809666

>>809592

using list comprehensions to generate the tuple list being fed into the table is another 20% performance boost, now 1.8-2.0s instead of 2-2.4


def dowork(sequence):
	conn=sqlite3.connect(':memory:')
	cursor=conn.cursor()

	seq_list=sequence.split(',')
	seq_list=[(x,) for x in seq_list ]
	target_length=(len(seq_list)/2)

	cursor.execute ('CREATE TABLE challenge (integers integer)')
	cursor.executemany('INSERT INTO challenge VALUES (?)',seq_list)
	conn.commit()

	cursor.execute('''SELECT integers,max(rowid) FROM challenge 
		GROUP BY integers HAVING COUNT(1) > (?) LIMIT 1''',(target_length,))
		
	fetch=cursor.fetchone()
	if fetch!=None:
		return fetch[0]
	else:
		return 'None'

	cursor.execute('DROP TABLE challenge')
	conn.commit()
	conn.close()

▶Anonymous 10/24/17 (Tue) 08:42:38 No.809605

>>809602

I thought i screwed up here

seq_list=[(x,) for x in seq_list ]

should have been

seq_list=[(int(x),) for x in seq_list ]

but apparently it doesn't matter, sqlite will do the conversion if it can, which is probably part of the 20% performance increase.

▶Anonymous 10/24/17 (Tue) 08:42:58 No.809606>>809622 >>809627

>>809600

How can there be more than one, Pajeet?

▶Anonymous 10/24/17 (Tue) 08:53:13 No.809611>>809614

>>809602


	cursor.execute('DROP TABLE challenge')
	conn.commit()
	conn.close()

Those lines are unreachable, aren't they?

▶Anonymous 10/24/17 (Tue) 08:57:21 No.809614

>>809611

ya that's another fuckup, that should go above

if fetch!=None:

i don't know how necessary it even is, python probably closes it and garbage collects it anyway, in theory.

▶Anonymous 10/24/17 (Tue) 09:02:13 No.809615>>809636 >>809648

i made the fastest version:

extern crate memchr;
extern crate memmap;
extern crate rayon;

use std::cmp::{min};
use memchr::{memchr};
use memmap::{Mmap};
use rayon::prelude::*;

const MAX_TEST_CASES: usize = 40;
const MAX_N: usize = 100;
const MIN_L: usize = 10000;
const MAX_L: usize = 30000;
const MIN_LINE_LEN: usize = MIN_L + MIN_L - 1;
const MAX_LINE_LEN: usize = MAX_L * 3 + MAX_L - 1;
const MAX_TESTS_LEN: usize = MAX_LINE_LEN * MAX_TEST_CASES + MAX_TEST_CASES;

fn main() {
	let mut args = std::env::args_os().skip(1);
	assert_eq!(args.len(), 1);

	let mmap = Mmap::open_path(args.next().unwrap(), memmap::Protection::Read).unwrap();
	let slice = unsafe { mmap.as_slice() };

	for &r in do_it(slice).iter().filter(|&r| r.is_some()) {
		match r.unwrap() {
			Some(v) => { println!("{}", v); }
			None => { println!("None"); }
		}
	}
}

fn do_it(slice: &[u8]) -> [Option<Option<usize>>; MAX_TEST_CASES] {
	debug_assert!(slice.len() <= MAX_TESTS_LEN);

	let mut results = [None; MAX_TEST_CASES];
	split_line(slice, &mut results);
	results
}

fn split_line(slice: &[u8], results: &mut [Option<Option<usize>>]) {
	if slice.is_empty() {
		return;
	}

	debug_assert!(slice.len() >= MIN_LINE_LEN);

	let (r, rs) = results.split_first_mut().unwrap();

	match memchr(b'\n', &slice[MIN_LINE_LEN..min(slice.len(), MAX_LINE_LEN + 1)]) {
		Some(i) => {
			let (sl, sr) = slice.split_at(MIN_LINE_LEN + i);
			rayon::join(|| process_line(sl, r), || split_line(&sr[1..], rs));
		}
		None => { rayon::join(|| process_line(slice, r), || {}); }
	}
}

fn process_line(slice: &[u8], result: &mut Option<Option<usize>>) {
	debug_assert!(slice.len() >= MIN_LINE_LEN);
	debug_assert!(slice.len() <= MAX_LINE_LEN);

	let (c, m) = slice.par_split(|&b| b == b',')
		.map(|b| parse(b))
		.fold(|| (0, [0; MAX_N + 1]), |(c, mut m), i| { m[i] += 1; (c + 1, m) })
		.reduce(|| (0, [0; MAX_N + 1]), |(c1, mut m1), (c2, m2)| { for i in 0..MAX_N + 1 { m1[i] += m2[i]; } (c1 + c2, m1) });

	match m.par_iter().cloned().enumerate().find_any(|&(_, v)| v >= c / 2) {
		Some((i, c)) => {
			debug_assert!(c >= MIN_L);
			debug_assert!(c <= MAX_L);
			*result = Some(Some(i));
		}
		None => { *result = Some(None); }
	}
}

fn parse(b: &[u8]) -> usize {
	debug_assert!(b.iter().all(|&b| b >= b'0' && b <= b'9'));
	debug_assert!(std::str::from_utf8(b).unwrap().parse::<usize>().unwrap() <= MAX_L);

	(match b.len() {
		1 => { b[0] - b'0' }
		2 => { (b[0] - b'0') * 10 + (b[1] - b'0') }
		3 => { 100 }
		_ => { unreachable!() }
	}) as usize
}

▶Anonymous 10/24/17 (Tue) 09:23:32 No.809622>>809627 >>809628

>>809606

sheeeeeeiiiiiit im retarded

▶Anonymous 10/24/17 (Tue) 09:36:36 No.809627

>>809606

>>809600

>>809622

that was in the back of my mind too until i thought about it for a second

▶Anonymous 10/24/17 (Tue) 09:38:52 No.809628>>810021

>>809622

We all have our moments of diversity.

▶Anonymous 10/24/17 (Tue) 09:53:00 No.809636>>809638 >>809640 >>809643

>>809615

How do you run this shit? I tried running it and got:


thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted

Also, how does one even know what source is being pulled into these rust projects?


 Downloading rayon-core v1.2.1
 Downloading futures v0.1.16
 Downloading coco v0.1.1
 Downloading rand v0.3.17
 Downloading lazy_static v0.2.9
 Downloading num_cpus v1.7.0
 Downloading scopeguard v0.3.3
 Downloading either v1.3.0
 Downloading memmap v0.5.2
 Downloading libc v0.2.32
 Downloading memchr v2.0.0

Has anyone ever looked at what these are? How many are from complete nobodies with no oversight and no maintenance? How do I get the exact source that the version installed corresponds to to see if they're evil?

▶Anonymous 10/24/17 (Tue) 09:58:27 No.809638>>809639

>>809636

holy fuck you stupid nigget increase you stack size limit

▶Anonymous 10/24/17 (Tue) 10:00:39 No.809639

>>809638

I have no idea how to make your shitty language work. If you need stupid amounts of ram or whatever, say what options to build with.

▶Anonymous 10/24/17 (Tue) 10:00:56 No.809640

>>809636

look into the target folder. cargo caches the crates there

▶Anonymous 10/24/17 (Tue) 10:19:56 No.809643>>809644

>>809636

>rust happily pulls in random sources from random places

great language steve

▶Anonymous 10/24/17 (Tue) 10:25:04 No.809644>>809650

>>809643

>random sources

the Cargo.toml tells you exactly what gets pulled

>random places

it pulls the requested crates from crates.io

very low energy bait.

▶Anonymous 10/24/17 (Tue) 10:33:33 No.809648>>809652 >>809704 >>809770

>>809615

>i made the fastest version

dohoho, we'll see about that.

I built your shit on an i7-6700k in --release mode and ran the binary directly rather than via cargo using test data from >>809582 and compared it to mine in >>808872 . If there's some other bullshit I need to do to make rust go fast, tell me. Otherwise,

Yours:


real    0m0.020s
user    0m0.108s
sys     0m0.016s

Mine:


real    0m0.009s
user    0m0.008s
sys     0m0.000s

You got beat by someone who identifies as a girl for purposes of determining score! Girl power!

And not just beat, fucking destroyed. Let's look closer at that. You used 8 threads (6700k) against a single-threaded program and still lost. Not only did you lose, you managed to waste 0.108s of CPU time for a 0.020s runtime. So in reality rather than an artificial test where your program would be competing for CPU time with other programs, it's more accurate to say you need 108ms of CPU time to produce a result vs 8ms of CPU time. You also used 3x the ram. gj Rust community!

▶Anonymous 10/24/17 (Tue) 10:38:27 No.809650>>809653 >>809671

>>809644

Have you looked at any of it? I looked at what was being pulled in via rayon just now and they're unsafely trusting the environment. I can make any Rust program using that crate crash if I can get "RAYON_NUM_THREADS" into the environment. Did the Rustfags learn nothing from shellshock? Fucking christ, this is like a '90s tier exploit. Webdevs have no business writing systems code.

▶Anonymous 10/24/17 (Tue) 10:41:56 No.809652>>809654

>>809648

fugg XDDDD

my rust version is really way slower

▶Anonymous 10/24/17 (Tue) 10:43:14 No.809653

>>809650

Also, the funny thing is rayon disables checking for RAYON_LOG in release builds as they recognize trusting the environment is unsafe (I was hoping I could overwrite your /etc/passwd), but they neglected to do that for their other environment variables. Check log.rs and lib.rs. I wonder how much of the Rust ecosystem is this insecure, or outright malicious, since no one checks.

▶Anonymous 10/24/17 (Tue) 10:48:30 No.809654

>>809652

shit i really fucked up bigly. for some reason this shit doesnt even utilize all my cores.

well at least is is memory and thread safe XDDDDDDD

▶Anonymous 10/24/17 (Tue) 11:20:07 No.809666>>809670

File (hide): a4461b39254ebf9⋯.png (2.83 KB, 556x77, 556:77, hiss hiss.png) (h) (u)

>>809592

with the modified function from >>809602

▶Anonymous 10/24/17 (Tue) 11:26:56 No.809670>>809672

>>809666

where are these tests coming from?

▶Anonymous 10/24/17 (Tue) 11:27:11 No.809671>>809680

>>809650

>trusting the environment is bad

What the fuck am I reading? If you can't trust the environment you have bigger problems than someone setting RAYON_NUM_THREADS to over 9000.

▶Anonymous 10/24/17 (Tue) 11:27:48 No.809672>>809677

>>809670

from some pajeet code challenge/practice website thing

▶Anonymous 10/24/17 (Tue) 11:40:40 No.809677

File (hide): 9050ca0ee319603⋯.jpg (28.38 KB, 255x292, 255:292, 1488571705237.jpg) (h) (u)

>>809672

>can't solve the challenge

>has an autism fit when other people solve it

>calls everything pajeet

stay mad kiddo

▶Anonymous 10/24/17 (Tue) 11:46:35 No.809680>>809686

>>809671

It's like you've already forgotten the lessons re-learned from shellshock. The environment should not be trusted. There are too many ways to get information into it, and it's led to hundreds of exploits over the years.

▶Anonymous 10/24/17 (Tue) 11:56:54 No.809686>>809688 >>809710

>>809680

>shellshock

>Many Internet-facing services, such as some web server deployments, use Bash to process certain requests

LOL. bash did nothing wrong. webdevs did

>The environment should not be trusted

>int fd = open(argv[1], O_RDONLY);

LOL. webdev confirmed

>There are too many ways to get information into it, and it's led to hundreds of exploits over the years.

the same can be said about sql queries.

▶Anonymous 10/24/17 (Tue) 12:04:09 No.809687

>>809582

It's like you don't know Python.


#!/usr/bin/python3
ii=__import__;i=ii("random");r,ri=i.random,i.randint;m=lambda lll:str(int(lll()));[getattr(__builtins__,getattr("","".join([chr(i) for i in sum([[j,j+5] for j in [[k*5+1,k*5] for k in [21]][0]],[])]))([chr(i) for i in [112,114,105,110,116]]))(",".join([getattr(m,"".join([j if chr(99)==j[0] else i for i,j in [[(['__']*3)[i],['wall', 'call', 'mall'][i]] for i in range(3)]]))(lambda:ri(0,100)) for j in range(ri(10000, 30000))])) for i in range(40)]

Sadly can't add bias here, though.

▶Anonymous 10/24/17 (Tue) 12:04:47 No.809688>>809689

>>809686

>numerous environment based CVEs dating back 20 years

>interpreters like perl, python, php, and ruby provide ways to disable dangerous environment variables

>numerous attacks on suid binaries through environment variables they didn't know they had

>rayon itself disables almost all of its dangerous environment variables in release mode, but misses two

"Nope, must not be a security problem!"

You're a fucking moron, anon. Please never work on anything important.

▶Anonymous 10/24/17 (Tue) 12:12:21 No.809689>>809691 >>809710

>>809688

lol stop sperging out, webdev. it is not a programs fault if its environment is compromised. maybe webdevs should stop putting malicious stuff into the environment????

>rayon itself disables almost all of its dangerous environment variables in release mode, but misses two

literally the only two environment variables rayon reads are RAYON_NUM_THREADS and RAYON_LOG: https://github.com/rayon-rs/rayon/search?utf8=%E2%9C%93&q=env%3A%3Avar&type=

rayon enables logging only in debug mode or if you enable a compiler flag. it isnt diabled because "muh dangerous env variables".

fucking kill yourself you fag.

▶Anonymous 10/24/17 (Tue) 12:19:47 No.809691>>809693

>>809689

Hello, eternal September. Enjoy re-learning what we learned the hard way. lol.

▶Anonymous 10/24/17 (Tue) 12:22:51 No.809693>>809694

>>809691

not an argument, lol. reading environment variables is totally ok and if there is shit in there that isnt supposed to be in there, it isnt my fault, but the webdevs who put it there.

if you dont a particular program using environment variables because you are a webdev and might have put shit into it, just clear them.

▶Anonymous 10/24/17 (Tue) 12:28:10 No.809694>>809697

>>809693

>if there is shit in there that isnt supposed to be in there, it isnt my fault, but the webdevs who put it there

Quality defense-in-depth on display from the eternal September.

▶Anonymous 10/24/17 (Tue) 12:42:44 No.809697>>809700

>>809694

>defense-in-depth

holy shit you fucking retard. you won. i wont read env variables anymore. i also wont use command line arguments, files, user input and all that good stuff. after all, if my programm does absolutely nothing than it cant do anything wrong, right?

in fact you advice made even rust obsolete, since accessing memory is insecure.

▶Anonymous 10/24/17 (Tue) 12:51:34 No.809700>>809701

>>809697

You type like a child.

▶Anonymous 10/24/17 (Tue) 12:52:48 No.809701

>>809700

amazing argument. you win 10 interwebz

▶Anonymous 10/24/17 (Tue) 12:59:32 No.809703>>809706 >>809709 >>809731 >>810187 >>811220


#include <iostream>
#include <fstream>
#include <cstring>

#define N 101

int main(int argc, char** argv) {
    uint_fast16_t count[N];
    std::memset(count, 0, sizeof count);

    if (std::ifstream infile{argv[1], std::ios::binary}) {
        uint_fast16_t n = 0;
        uint_fast16_t L = 0;
        while (infile >> n) {
            ++count[n];
            ++L;
            if (infile.peek() == ',') {
                infile.ignore(1);
            }
            else {
                L /= 2;

                bool found = false;
                for (size_t i = 0; i < N; ++i) {
                    if (count[i] >= L) {
                        std::cout << i << "\n";
                        found = true;
                        break;
                    }
                }
                if (!found) {
                    std::cout << "None\n";
                }

                std::memset(count, 0, sizeof count);
                L = 0;
            }
        }
    }
}

▶Anonymous 10/24/17 (Tue) 13:00:55 No.809704>>809707

>>809648

What about memory usage in either of those programs?

▶Anonymous 10/24/17 (Tue) 13:04:57 No.809705

File (hide): 8eb4340ed4f1146⋯.jpg (82.33 KB, 960x712, 120:89, 1485973317300.jpg) (h) (u)

>>809307

euro cs student here,

people graduating from cs in europoora can't even write a for loop

▶Anonymous 10/24/17 (Tue) 13:05:36 No.809706

>>809703

i didn't think about doing it that way at all, i'm going to write something similar in python and see how shitty the python performance doing the same type of thing is.

▶Anonymous 10/24/17 (Tue) 13:06:21 No.809707>>809810

>>809704

I mentioned the core size difference was 3x. The C version is about 3M (the size of the worst-case test file that it reserves space for). The rust version is about 9M but I have no idea how to reason about where the memory is getting used as I don't know rust. It might just be from all the stack segments for the threads.

▶Anonymous 10/24/17 (Tue) 13:09:36 No.809709

File (hide): 889881613263866⋯.png (2.34 KB, 540x70, 54:7, c plus equality.png) (h) (u)

>>809703

▶Anonymous 10/24/17 (Tue) 13:10:50 No.809710>>809716 >>809717

>>809689

>webdevs should stop putting malicious stuff into the environment

they don't. CGI does. Any webserver that implements it correctly will do likewise. It's webdevs who are responsible for being three degrees removed from the people responsible for responsibly handling their environment variables, just like any other program input.

>>809686

>bash did nothing wrong

kill yourself. this was a completely retarded 'feature' that would've been a massive problem even if the web had not existed at all. How common is it for programs to cheap out and invoke a subshell when not strictly necessary? Especially when the command is safe on its face? Bash made that unsafe.

EATSHIT="I don't even remember how this useless 'feature' goes" /usr/bin/some_setuid_admin_bin

^ now any subshells also does an rm -rf as root. Nice fucking job.

▶Anonymous 10/24/17 (Tue) 13:22:52 No.809716

>>809710

>CGI does

ok. but does cgi write to RAYON_NUM_THREADS or RAYON_LOG?

▶Anonymous 10/24/17 (Tue) 13:25:51 No.809717

>>809710

How is any of this malicious?

http://www.cgi101.com/book/ch3/text.html

▶Anonymous 10/24/17 (Tue) 13:27:50 No.809719>>809720

can you two autists go get your own thread where you can hatefuck each other all you want?

▶Anonymous 10/24/17 (Tue) 13:29:31 No.809720

>>809719

right after you post some code

▶Anonymous 10/24/17 (Tue) 14:05:25 No.809731>>809735 >>811220

>>809703

Slightly changed to use more idiomatic C++


#include <iostream>
#include <fstream>
#include <array>
#include <algorithm>
#include <iterator>

#define N 101

int main(int argc, char** argv) {
    std::array<uint_fast16_t, 101> count;
    std::fill(count.begin(), count.end(), 0);

    if (std::ifstream infile{argv[1], std::ios::binary}) {
        uint_fast16_t n = 0;
        uint_fast16_t L = 0;
        while (infile >> n) {
            ++count[n];
            ++L;
            if (infile.peek() == ',') {
                infile.ignore(1);
            }
            else {
                L /= 2;

                auto loc = std::find_if(count.begin(), count.end(), [&](uint_fast16_t& i) {
                    return i >= L;
                });

                if (loc != std::end(count)) {
                    std::cout << std::distance(count.begin(), loc) << "\n";
                }
                else {
                    std::cout << "None\n";
                }

                std::fill(count.begin(), count.end(), 0);
                L = 0;
            }
        }
    }
}

▶Anonymous 10/24/17 (Tue) 14:17:55 No.809735>>809738

>>809731

i really need to stop wasting time with python.

▶Anonymous 10/24/17 (Tue) 14:19:33 No.809738>>809744

>>809735

what do you mean?

▶Anonymous 10/24/17 (Tue) 14:27:13 No.809744>>809746 >>809751

>>809738

i've spent way too long writing everything in python and it would take time now to get back into C. python is great for glue and all the libraries and all that but clearly the performance no matter what you do is going to be 100 times worse than C.

▶Anonymous 10/24/17 (Tue) 14:34:10 No.809746

File (hide): 1654e6e6e6c9cea⋯.jpg (111.58 KB, 1920x1080, 16:9, 19640623091.jpg) (h) (u)

>>809744

mfw even perl5 is faster than python

▶Anonymous 10/24/17 (Tue) 14:42:08 No.809747>>809759 >>809773

Since the rustfag is trying to beat my single-threaded code with threads, I'm going to swoop in here and 'improve' my entry. It is unbeatable without just micro-optimizing this code and resubmitting it. Bow down to your queen, and learn something from my superior technique. Time this with that site of yours, and it better have more than one goddamn core available after I spent time doing this.


#define _GNU_SOURCE
#include <unistd.h>
#include <sys/mman.h>
#include <sched.h>
#include <errno.h>
#include <stdio.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define MAX_WORKERS 64
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t const tens[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
static long processors;
static off_t size;

static inline uint_fast8_t parse_n(char * restrict * restrict c) {
  uint_fast8_t retval = 0;

  char *begin = *c;
  while(**c >= '0' && **c <= '9') {
    ++*c;
  }

  int len = *c - begin;

  switch(len) {
    case 3: retval = 100; break;
    case 2: retval = tens[begin[len - 2] - '0'];
    case 1: retval += begin[len - 1] - '0';
  }

  return retval;
}

typedef struct {
  pthread_t thread;
  char *begin;
  char answer[MAX_CASES][5];
  int answers;
  uint_fast16_t counts[100 + 1];
} __attribute__((aligned(64))) worker_t;
worker_t workers[MAX_WORKERS];

void *thread_start(void *data) {
  long processor = (long) data;
  worker_t *worker = &workers[processor];
  uint_fast8_t n;
  uint_fast16_t l;
  int i;

  char *c = buffer + (size * processor) / processors;
  if(c > buffer) {
    c = strchr(c - 1, '\n') + 1;
  }
  if(!*c) return NULL;
  worker->begin = c;

  char *end;
  if(processor < processors) end = buffer + (size * (processor + 1)) / processors;
  else end = buffer + size;

  goto start;
  for(; c < end; ++c) {
    (void) memset(worker->counts, 0, sizeof(worker->counts));
start:
    for(l = 1; ; ++l, ++c) {
      n = parse_n(&c);
      ++worker->counts[n];
      if(*c == '\n') break;
    }

    uint_fast16_t half = l >> 1;
    for(i = 0; i < sizeof(worker->counts) / sizeof(worker->counts[0]); ++i) {
      if(worker->counts[i] > half) {
        sprintf(worker->answer[worker->answers], "%d", i);
        break;
      }
    }
    if(i == sizeof(worker->counts) / sizeof(worker->counts[0])) sprintf(worker->answer[worker->answers], "None");
    ++worker->answers;
  }

  return NULL;
}

int main(int argc, char *argv[]) {
  struct stat stat;

  int fd = open(argv[1], O_RDONLY);
  fstat(fd, &stat);
  size = stat.st_size;
  buffer = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);

  errno = 0;
  processors = sysconf(_SC_NPROCESSORS_ONLN);
  if(processors > MAX_WORKERS) processors = MAX_WORKERS;

  for(long processor = 0; processor < processors; ++processor) {

    pthread_attr_t attr;
    pthread_attr_init(&attr);

    cpu_set_t cpus;
    CPU_ZERO(&cpus);
    CPU_SET(processor, &cpus);

    pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &cpus);

    pthread_create(&workers[processor].thread, &attr, &thread_start, (void *) processor);
  }

  for(long processor = 0; processor < processors; ++processor) {
    worker_t const *worker = &workers[processor];
    pthread_join(worker->thread, NULL);
  }

  for(long processor = 0; processor < processors; ++processor) {
    worker_t const *worker = &workers[processor];
    if(processor + 1 < processors && workers[processor].begin == workers[processor + 1].begin) continue;
    for(int i = 0; i < worker->answers; ++i) {
      printf("%s\n", worker->answer[i]);
    }
  }

  return 0;
}

Thing is, if you look close, this is maximum faggotry. I would never write code like this normally. It is highly optimized to have an extremely good wall clock time (how these tests are scored) but will sacrifice CPU time to do it. In the real world where you have other shit running, that's pretty dumb. But winning is everything, so whatever. Some highlights:

- It gambles that the tests are being run in an unclean state and uses mmap() to swipe the data from the fs cache. mmap() is slower in the real world due to the cost of page faults.

- It divides the buffer across threads and has each thread find its starting position independent of the others rather than what you're taught in school: to make a list of the work (e.g., split the lines) then divide it amongst work queues. This is a big optimization for wall-clock time but it means there is overlap and some results might be computed and thrown away (wasted CPU time). Threads will have a less balanced workload with this technique, but it's a huge win for wall clock time not having to scan the entire file on a single thread, so it works out.

▶Anonymous 10/24/17 (Tue) 14:44:22 No.809751

>>809744

FWIW, I (your queen) heavily use Python, C, and C++. They're all great at their niche. Just don't use the wrong tool for the job.

▶Anonymous 10/24/17 (Tue) 14:55:51 No.809759>>809761

>>809747


/tmp/ccMmS4zt.o: In function `main':
source.c:(.text+0x364): undefined reference to `pthread_attr_setaffinity_np'
source.c:(.text+0x37c): undefined reference to `pthread_create'
source.c:(.text+0x3b0): undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status

Sorry queen

▶Anonymous 10/24/17 (Tue) 14:58:31 No.809761>>809762

>>809759

-lpthread, newfag.

▶Anonymous 10/24/17 (Tue) 14:59:54 No.809762>>809766 >>809767

>>809761

Can't influence switches on the testing site, sorry ~~queer~~ queen

▶Anonymous 10/24/17 (Tue) 15:03:52 No.809766>>809768 >>809770

>>809762

That's pretty terrible. So it allows threads in C++ and Rust but locks out C? What a Pajeet website.

▶Anonymous 10/24/17 (Tue) 15:05:46 No.809767>>809771

>>809762

>I would immediately and comfortably use a library for this in any real-world circumstance

>my professor's test environment doesn't include it

quit college. try making demands for my money back no matter how unlikely that is. drop hints about coming back to get a degree in philosophical proctology

>my code wins if you pass -Ogo-faster

>but this testing site only caters to 'default' language features

sell all property

go to renfair

buy bastard sword

travel to physical locations of testing site's admins

"shall we cross steel, churl?"

▶Anonymous 10/24/17 (Tue) 15:05:51 No.809768

>>809766

I haven't checked if it supports C++ threads.

It doesn't support Rust at all lel

▶Anonymous 10/24/17 (Tue) 15:06:30 No.809770>>809772

>>809766

Since it locks it out, here's results from a 6700k using that test data from python,


real    0m0.001s
user    0m0.004s
sys     0m0.000s

1 MS

Compare with the single-threaded code I gave numbers for in >>809648,


real    0m0.009s
user    0m0.008s
sys     0m0.000s

So I'd expect if that shitty site could run it, since it was at 3ms in >>809255, it might register as 0ms.

▶Anonymous 10/24/17 (Tue) 15:07:09 No.809771

File (hide): c0f198f244f5158⋯.jpg (86.64 KB, 640x633, 640:633, 1458672782192.jpg) (h) (u)

>>809767

▶Anonymous 10/24/17 (Tue) 15:09:17 No.809772

>>809770

The site seems to value low memory usage more than cpu usage after a certain threshold for cpu time has been crossed.

▶Anonymous 10/24/17 (Tue) 15:09:37 No.809773>>809774

>>809747

i changed your single threaded version to also memory map the file and it is as fast as your multithreaded version

▶Anonymous 10/24/17 (Tue) 15:13:29 No.809774>>809779

>>809773

It needs a large test case to notice a difference. The script in >>809582 should be enough. It scales really well so the larger the data gets the better it'll do.

▶Anonymous 10/24/17 (Tue) 15:17:09 No.809779>>809780 >>809783

>>809774

lol no i tripled the number of tests and now the single threaded version is faster

▶Anonymous 10/24/17 (Tue) 15:18:36 No.809780>>809784

>>809779

You might have a shitty CPU, anon. You can try limiting the MAX_WORKERS to half to see if you've got a bad implementation of hyperthreads.

▶Anonymous 10/24/17 (Tue) 15:20:42 No.809783

>>809779

Also, if you've got something burning CPU for no reason, I set affinity on each thread and the slowest one determines your runtime so that will create a bottleneck. You can comment out the affinity stuff and have the scheduler route around that, but it's less efficient.

▶Anonymous 10/24/17 (Tue) 15:21:18 No.809784>>809791

>>809780

i tried 32, 16, 8 and 4 workers. single threaded version is still better.

▶Anonymous 10/24/17 (Tue) 15:28:32 No.809790>>809793 >>809794

Queen,

what is your dayjob? Are you from NA or EU?

▶Anonymous 10/24/17 (Tue) 15:30:01 No.809791>>809792

>>809784

It points to a problem with your setup. The bottleneck code is essentially the same other than setting affinity to run on a core the scheduler might not have otherwise chosen. You should look into it further.

▶Anonymous 10/24/17 (Tue) 15:30:48 No.809792>>809800

>>809791

no my setup is fine. it is just that your multi threaded version is shit. modify the single threaded to use mmap and see for yourself

▶Anonymous 10/24/17 (Tue) 15:31:22 No.809793>>809795

>>809790

I do low-level networking on embedded devices and gamedev. Simultaneously. It's been a strange ride. I'm currently working 12 hours a day.

▶Anonymous 10/24/17 (Tue) 15:31:49 No.809794>>809797

>>809790

Also, in socal. And I've not slept yet. It's 8:30am..

▶Anonymous 10/24/17 (Tue) 15:32:14 No.809795>>809796 >>809801

>>809793

>working 12 hours a day

>still shitposts on /tech/

i smell LARPing

▶Anonymous 10/24/17 (Tue) 15:34:35 No.809796>>809799

>>809795

i doubt a LARPer could produce the C source in this thread

▶Anonymous 10/24/17 (Tue) 15:35:31 No.809797>>809804

>>809794

>socal

are you also actually a girl, because that'd be hilarious

▶Anonymous 10/24/17 (Tue) 15:35:53 No.809799

>>809796

i mean it really isnt that hard. if you cant do that you should consider enrolling into a cs course

▶Anonymous 10/24/17 (Tue) 15:36:14 No.809800>>809802

>>809792

Modified to use mmap:


real    0m0.008s
user    0m0.008s
sys     0m0.000s

Again, your setup is shit. Like I said, there's no significant change in the bottleneck. It's purely going to be an issue with affinity that you can solve.

▶Anonymous 10/24/17 (Tue) 15:38:10 No.809801>>809803

>>809795

I had today off to deal with an immigration interview for my wife (she's not mail order, just Canadian).

▶Anonymous 10/24/17 (Tue) 15:38:23 No.809802>>809805

>>809800

no you fag. my setup is fine. your shitty multithreaded version just sucks.

▶Anonymous 10/24/17 (Tue) 15:38:44 No.809803

>>809801

>canadians cant be mail order

▶Anonymous 10/24/17 (Tue) 15:39:03 No.809804>>809806

>>809797

No, I'm a 40 year old WHITE MALE.

▶Anonymous 10/24/17 (Tue) 15:39:20 No.809805

>>809802

▶Anonymous 10/24/17 (Tue) 15:40:09 No.809806

>>809804

Hi i'm tyrone, want me to take care of your wife?

▶Anonymous 10/24/17 (Tue) 15:54:41 No.809810>>809821

>>809707

B-b-but muh SAFE memory usage!!!! It's safe.

▶Anonymous 10/24/17 (Tue) 16:16:22 No.809821

>>809810

yes it is safe. when i wrote this piece of shit rust code i intended it go fast and didnt give a shit about memory usage. unfortunately i didnt benchmark this trashy codevomit before i proudly presented it to you.

▶Anonymous 10/24/17 (Tue) 16:19:28 No.809823

>>809815

>50%+1 times, only use half of the sequence.

No, you cannot assume that.

▶Anonymous 10/24/17 (Tue) 16:22:02 No.809825>>809826 >>809827 >>809829 >>809835 >>809856 >>810015

i deleted the last post because i accidentally posted the wrong fucked up code. this is the correct code, and time is down to 1.0-1.2s compared to 2.2s with

>>809592

still obviously shit compared to these C solutions, it's 1.8s with 1 process.

the improvement here is get rid of half of the sequence immediately, since the number is guaranteed to be in 50%+1 of that sequence.


#!/bin/python2.7
import argparse
import sqlite3
from multiprocessing import Pool

def dowork(sequence):
	conn=sqlite3.connect(':memory:')
	conn.execute ('PRAGMA journal_mode = OFF')
	conn.execute ('PRAGMA synchronous = OFF')
	conn.execute ('PRAGMA locking_mode = EXCLUSIVE')

	cursor=conn.cursor()

	seq_list=sequence.split(',')
	target_length=(len(seq_list)/2)

	#drop half of the sequence, unnecessary
	seq_list=seq_list[:target_length]

	seq_list=[(x,) for x in seq_list ]

	cursor.execute ('CREATE TABLE challenge (integers integer)')
	cursor.executemany('INSERT INTO challenge VALUES (?)',seq_list)
	conn.commit()

	cursor.execute('''SELECT integers,COUNT(1) as integers FROM challenge GROUP BY integers
		ORDER BY COUNT(1) DESC LIMIT 1''')
		
	fetch=cursor.fetchone()

	cursor.execute('DROP TABLE challenge')
	conn.commit()
	conn.close()

	if fetch==None:
		return None
	elif fetch[1]>((target_length/2)-1):
		return fetch[0]
	else:
		return None

def main():
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()
	try:
		sequences=open(args.filename,'r').read().rstrip().split('\n')
	except:
		print 'None'

	p=Pool(1)
	majors=p.map(dowork,sequences)
	for major in majors:
		print major

	dowork(sequences[0])

if __name__=="__main__":
	main()

▶Anonymous 10/24/17 (Tue) 16:23:30 No.809826

>>809825

>since the number is guaranteed to be in 50%+1 of that sequence.

Stop saying that. How many ways are there to select HALF of the numbers?

▶Anonymous 10/24/17 (Tue) 16:23:41 No.809827

>>809825

i need a drink, still fucked up the post, delete

dowork(sequences[0]) in main(). that was just there for testing.

▶Anonymous 10/24/17 (Tue) 16:34:46 No.809829>>809830 >>809856

File (hide): 1e658c3b79efefb⋯.png (638.07 KB, 586x962, 293:481, african science.png) (h) (u)

>>809825


>	seq_list=sequence.split(',')
>	target_length=(len(seq_list)/2)

>	#drop half of the sequence, unnecessary
>	seq_list=seq_list[:target_length]

▶Anonymous 10/24/17 (Tue) 16:36:29 No.809830>>809831 >>809832

>>809829

that is a very happy looking nigger. what is he looking at?

▶Anonymous 10/24/17 (Tue) 16:38:43 No.809831

>>809830

You're in for a treat. Just watch the first minute.

https://www.youtube.com/watch?v=WaM47Otif04

▶Anonymous 10/24/17 (Tue) 16:38:50 No.809832

>>809830

https://www.youtube.com/watch?v=C9SiRNibD14

▶Anonymous 10/24/17 (Tue) 16:45:09 No.809835

>>809825

sergio, el scripto no werko

▶Anonymous 10/24/17 (Tue) 16:51:04 No.809836>>809838 >>809885

File (hide): 8b5ebf32b3575db⋯.gif (16 KB, 125x125, 1:1, 1409835665168.gif) (h) (u)

Noob here, why is the C++ solution more concise than the C one but still rather efficient? Isn't C++ shit?

▶Anonymous 10/24/17 (Tue) 16:56:42 No.809838>>809841

>>809836

>why do I need less code when I use higher-level code

because you're getting more done per line, duh. the parse_n in the cheater threads code would not even take up a single line in a scripting language; it'd be one subexpression of a line, like string.gmatch in the lua:

  print(major(string.gmatch(line, "%d+")) or "None")

language shitness isn't just about code terseness, or we would also shill APL

▶Anonymous 10/24/17 (Tue) 17:11:56 No.809841>>809842 >>809845

>>809838

But people here say C is the best and shit like C++ and Java are bloated heaps of trash.

▶Anonymous 10/24/17 (Tue) 17:13:37 No.809842>>809857

>>809841

i'm waiting for someone to post the x86 assembly solution.

▶Anonymous 10/24/17 (Tue) 17:19:42 No.809845>>809858

>>809841

again: is APL the best? There's more to language shitness than tersity. Tersity doesn't even reliably translate to humans spending less time on writing/reading/troubleshooting that span of code.

▶Anonymous 10/24/17 (Tue) 18:04:36 No.809856>>809886 >>810015

File (hide): 55ae5826c7a5215⋯.gif (173.35 KB, 1000x164, 250:41, slideit.gif) (h) (u)

>>809825

>>809829

>the improvement here is get rid of half of the sequence immediately

The maximum will occur in some half+1 length subsequence, but you made the false assumption that the subsequence necessarily starts at the beginning. Imagine the subsequence as a sliding window, and you'll see why that assumption is wrong.

See gif related.

▶Anonymous 10/24/17 (Tue) 18:05:42 No.809857

>>809842

>not x86_64

Not wonder why you arent Gods chosen high priest

▶Anonymous 10/24/17 (Tue) 18:07:26 No.809858>>809860

>>809845

/tech/ said C++ is shit so i refuse to believe it can be as good as gods language also known as C.

▶Anonymous 10/24/17 (Tue) 18:17:00 No.809860>>809864

>>809858

ok mang. You saw through my attempts at deflection. Here's the truth: C is God's language and C++ is complete shit. C++'s shitness does not extend to it not having a number of conveniences that show off in toy code like this. The devil's succubi do not appear as ugly hags, but beautiful maidens. Similarly C++ has superficial benefits that lure you into letting it waste tremendous amounts of brainspace on it. C++ is the devil's own attempt at a DOS attack on your brain. Though warlocks and sinners believe that they can deal with the devil, extract some narrow advantage ("I'll just use Qt for the GUI"), and then flee, they invariably fall to hell as well, and become mad and incapable of safe coding. Although some will claim to have successful careers, this is a combination of survivor's bias and the AIDS-patient mental weakness of such people not coming across in text.

Save yourself.

▶Anonymous 10/24/17 (Tue) 18:22:42 No.809861>>809866 >>811099 >>811220

>>808650 (OP)


package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

func process(filename string) {
	f, err := os.Open(filename)
	if err != nil {
		panic(err)
	}
	defer f.Close()
	s := bufio.NewScanner(f)
	for s.Scan() {
		L := 0
		hist := make([]int, 101)
		numStrs := strings.Split(s.Text(), ",")
		for _, numStr := range numStrs {
			num, err := strconv.Atoi(numStr)
			if err != nil {
				panic(err)
			}
			if num < 0 || num > 100 {
				panic("number out of range")
			}
			L++
			hist[num]++
		}
		func() {
			for num, count := range hist {
				if count > L/2 {
					fmt.Println(num)
					return
				}
			}
			fmt.Println("None")
		}()
	}
}

func main() {
	if len(os.Args) != 2 {
		fmt.Printf("usage: %s [input]\n", os.Args[0])
		return
	}
	process(os.Args[1])
}

am grill

▶Anonymous 10/24/17 (Tue) 18:26:15 No.809864

>>809860

>devil's own attempt at a DOS attack on your brain

kek. Reminds me of my dance with the devil called Haskell. After reading a paper on how the type system can solve a rubiks cube, I really questioned what I was doing with my life.

▶Anonymous 10/24/17 (Tue) 18:26:38 No.809866>>809869 >>811099

File (hide): 52f8611248d580c⋯.png (2.49 KB, 535x67, 535:67, Goo.png) (h) (u)

>>809861

▶Anonymous 10/24/17 (Tue) 18:28:32 No.809869

>>809866

^_^

could have brought memory usage down a bit by reusing hist instead of making a new one, but since Go binaries are usually several megabytes by themselves, there's no point.

▶Anonymous 10/24/17 (Tue) 18:30:36 No.809872>>809877 >>809878 >>809888

Current powerranking:

C/C++ (the site actually gave the C++ one a higher rating even thogh it's somewhat slower)

Python

[Rust] (no metrics exist for the rust program so not sure where it ranks)

▶Anonymous 10/24/17 (Tue) 18:33:38 No.809877

>>809872

>no metrics for the rust program

because the site doesn't support it? what languages does it support then?

▶Anonymous 10/24/17 (Tue) 18:33:46 No.809878>>809919 >>810458

>>809872

You have to use these languages I believe:

> Python 2.7.6

> C 4.8.4

> C++ 4.8.4

> Java 1.8.0

> Ruby 2.3.0

> Perl 5.18.2

> PHP 5.5.9

> Tcl 8.6

> Clojure 1.7.0

> JavaScript 5.3.0

> Scala 2.9.2

> C# 4.0

> Objective-C 4.8.4

> Python 3 3.4.3

> Haskell 7.6.3

> Go 1.5.2

> Bash 4.3.11

> Lua 5.2.3

> R 3.2.3

> Visual Basic.NET 4.0

> D 4.8.4

> Fortran 4.8.4

> Guile 2.0.9

> OCaml 4.01.0

> Scheme

No Erlang, Ada, Rust, APL, J... total pleb tier.

▶Anonymous 10/24/17 (Tue) 18:43:14 No.809885

>>809836

The goal was to be fast, not concise. The C solutions could be made concise.

▶Anonymous 10/24/17 (Tue) 18:46:20 No.809886>>810015

File (hide): 09b025b0c59bcae⋯.png (12.96 KB, 522x140, 261:70, large.png) (h) (u)

>>809856

Ahh, just occurred to me that even this is wrong too. It's relying on the assumption of a continuous subsequence... and why should it be? The number of ways of selecting half+1 from even just 100 is astronomical. Trying to skip inputs is a pure dead end.

▶Anonymous 10/24/17 (Tue) 18:53:55 No.809888>>809891 >>809921

>>809872

>the site actually gave the C++ one a higher rating even thogh it's somewhat slower

Jesus, this site.. What metric do you think it's weighing higher?

▶Anonymous 10/24/17 (Tue) 18:56:27 No.809891

>>809888

their signups are also broken. I never got the confirmation email. searched and it's a known problem.

▶Anonymous 10/24/17 (Tue) 20:05:29 No.809918

#!/bin/bash

declare -a numbers

while read line
do
    IFS=', ' read -r -a array <<< "$line"

    for element in "${array[@]}"
    do
        numbers[$element]=$(( numbers[$element] + 1 ))
    done

    max=0

    for i in "${!numbers[@]}"
    do
        (( numbers[$i] > numbers[$max] )) && max=$i
    done

    if (( numbers[$max] > ${#numbers[@]} / 2 ))
    then
        echo "$max"
    else
        echo "None"
    fi

    for i in "${!numbers[@]}"
    do
        numbers[$i]=0
    done
done < "$1"

▶Anonymous 10/24/17 (Tue) 20:08:18 No.809919

>>809878

>gcc 4.8.4

>python3 3.4.3

>GHC 7.6.3

Is that fucking debian oldoldstable?

▶Anonymous 10/24/17 (Tue) 20:10:29 No.809921>>809922 >>810067

>>809888

CPU time does matter but after a certain limit it doesn't net you any more points. Low memory gives a lot of points.

▶Anonymous 10/24/17 (Tue) 20:17:26 No.809922

>>809921

I wonder if it'd count the shared mmap version as the full file size even though it only actually needs 4k of ram. I'lll send you that version when I get home.

▶Anonymous 10/24/17 (Tue) 20:49:44 No.809928>>809930 >>809973 >>810181

Step aside, Pajeet!

import Control.Arrow
import Data.List
import Data.List.Split
import Data.Maybe
import System.Environment
main = getArgs >>= readFile . head >>= putStr . unlines . map majorElement . lines
majorElement = maybe "None" snd . listToMaybe . uncurry moreThanHalf . (length &&& count) . splitOn ","
count = map (length &&& head) . group . sort
moreThanHalf l = filter ((> l `div` 2) . fst)

▶Anonymous 10/24/17 (Tue) 20:52:39 No.809930

File (hide): 114e9590da56186⋯.png (109.66 KB, 286x217, 286:217, 1468095421068.png) (h) (u)

>>809928

>uncurry

▶Anonymous 10/24/17 (Tue) 22:06:06 No.809973>>809998

>>809928

>bing uncurry

>webpage is just (a -> b -> c) -> ((a, b) -> c)

I don't have enough autism for haskell.

▶Anonymous 10/24/17 (Tue) 23:24:20 No.809998>>810000

>>809973

Imagine you have a machine, that when you feed a coin of type A, it outputs another machine that takes coins of type B. When you give that machine a B, it gives you a C.

What uncurry does, it wraps those details up, and says, if you give me an A and B right now, I'll give you a C.

▶Anonymous 10/24/17 (Tue) 23:27:40 No.810000

>>809998

Actually I missed one step at the end, it's a little more subtle. Uncurry is a machine that takes the machine I first described, and gives you a machine that takes an A and B at the same time. Of course, once you subsequently give that machine an A and B, you get a C.

▶Anonymous 10/24/17 (Tue) 23:53:07 No.810015>>810021

>>809886

>>809856

>>809825

fuck your right, this is solution isn't going to work at all. it only works on the test cases because the major numbers happen to be in > 50% of the first half of the sequence but it doesn't have to be. i thought you could drop every other number instead for a second or something to fix it but that's the exact same problem.

▶Anonymous 10/25/17 (Wed) 00:10:44 No.810021

>>810015

No worries, as >>809628 put it, we all have these moments. The difference is being autistic enough to care in the first place in rigorously attacking the problem. A true Pajeet can shit in his code base like this >>809051 and be proud of it.

▶Anonymous 10/25/17 (Wed) 00:55:12 No.810038>>810043 >>810087 >>810168

ye olde sepples

#include <iostream>
#include <fstream>
#include <string>

template<size_t N>
inline uint16_t count_each(std::string line, uint16_t(& count_of)[N])
{
    uint8_t curval = 0;
    uint16_t n_numbers = 1;
    for (size_t i = 0, sz = line.size(); i < sz; ++i){
        char c = line[i];
        if (c == ','){
            ++count_of[curval];
            curval = 0;
            ++n_numbers;
            continue;
        }
        curval = curval * 10 + (c - '0');
    }
    ++count_of[curval];
    return n_numbers;
}

template<size_t N>
inline size_t argL2(uint16_t(& count_of)[N], uint16_t l2)
{
    for (size_t i = 0; i < N; ++i)
        if (count_of[i] > l2)
            return i;
    return N;
}

inline void major(const char* filename)
{
    std::ifstream file(filename);
    for (std::string line;;){
        getline(file, line);
        if (!file) return;
        uint16_t count_of[101] = {};
        const uint16_t sz = count_each(std::move(line), count_of);
        const size_t res = argL2(count_of, sz / 2);
        if (res == sizeof(count_of) / sizeof(count_of[0]))
            std::cout << "None\n";
        else
            std::cout << res << '\n';
    }
}

int main(int argc, char* argv[])
{
    major(argv[1]);
}

▶Anonymous 10/25/17 (Wed) 01:07:16 No.810043>>810054

>>810038

a simple for loop that reads integers into an array while ifsteam object != ‘,’ would be more succinct.

▶Anonymous 10/25/17 (Wed) 01:32:08 No.810054

>>810043

True, but I was more trying to limit the number of ifstream calls for efficiency.

Don't know how much more efficient it actually is, though :^)

▶Anonymous 10/25/17 (Wed) 02:05:12 No.810067>>810182 >>811220

>>809921

This one should show up as low memory as the buffer is only necessarily 4k, but the site might be shit and count the whole map as memory. If it's fucked like that, I can change it in another way to get it to show low memory. This is just my single-threaded one using mmap() to see if it changes the ranking.


#define _GNU_SOURCE
#include <unistd.h>
#include <sys/mman.h>
#include <sched.h>
#include <errno.h>
#include <stdio.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>



#include <unistd.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define MAX_L 30000
#define SEPARATOR_LENGTH 1
#define EOL_LENGTH 1
#define MAX_N_LENGTH 1
#define MAX_CASES 40
#define MAX_LINE_LENGTH ((SEPARATOR_LENGTH * (MAX_L - 1)) + (MAX_N_LENGTH * MAX_L) + EOL_LENGTH)
#define MAX_FILE_LENGTH MAX_LINE_LENGTH * MAX_CASES

char *buffer;
uint_fast64_t counts[100];
uint_fast64_t const tens[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};

inline uint_fast64_t parse_n(char * restrict * restrict c) {
  uint_fast64_t retval = 0;

  char *begin = *c;
  while(**c >= '0' && **c <= '9') {
    ++*c;
  }

  int len = *c - begin;

  switch(len) {
    case 3: retval = 100; break;
    case 2: retval = tens[begin[len - 2] - '0'];
    case 1: retval += begin[len - 1] - '0';
  }

  return retval;
}

int main(int argc, char *argv[]) {
  struct stat stat;
  uint_fast64_t n;
  uint_fast16_t l;
  int i;

  int fd = open(argv[1], O_RDONLY);
  fstat(fd, &stat);
  buffer = mmap(NULL, stat.st_size, PROT_READ, MAP_SHARED, fd, 0);

  char *c = buffer;
  goto start;
  for(; *c; ++c) {
    (void) memset(counts, 0, sizeof(counts));
start:
    for(l = 1; ; ++l, ++c) {
      n = parse_n(&c);
      ++counts[n];
      if(*c == '\n') break;
    }
    uint_fast16_t half = l >> 1;
    for(i = 0; i < 100; ++i) {
      if(counts[i] > half) {
        printf("%d\n", i);
        break;
      }
    }
    if(i == 100) printf("None\n");
  }
}

▶Anonymous 10/25/17 (Wed) 02:08:31 No.810068>>810171

Nice problem.

[spoiler]1. Use a linear time selection algorithm to find the median.

2. The median doesn't necessarily appear >L/2 times, so count the number of occurrences.

3. If it does then return the median, else return None.[/spoiler]

▶Anonymous 10/25/17 (Wed) 02:48:02 No.810087>>810168 >>810182

Exact same as this: >>810038 , but multithreaded:

#include <iostream>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

template<size_t N>
inline uint16_t count_each(std::string line, uint16_t(& count_of)[N])
{
    uint8_t curval = 0;
    uint16_t n_numbers = 1;
    for (size_t i = 0, sz = line.size(); i < sz; ++i){
        char c = line[i];
        if (c == ','){
            ++count_of[curval];
            curval = 0;
            ++n_numbers;
            continue;
        }
        curval = curval * 10 + (c - '0');
    }
    ++count_of[curval];
    return n_numbers;
}

template<size_t N>
inline size_t argL2(uint16_t(& count_of)[N], uint16_t l2)
{
    for (size_t i = 0; i < N; ++i)
        if (count_of[i] > l2)
            return i;
    return N;
}

inline void major(const char* filename)
{
    std::vector<std::string> lines; lines.reserve(40);
    std::ifstream file(filename);
    for (std::string line;;){
        getline(file, line);
        if (!file) break;
        lines.push_back(std::move(line));
    }
    size_t n_threads = std::thread::hardware_concurrency();
    n_threads = n_threads == 0 ? 1 : n_threads;
    n_threads = n_threads > lines.size() ? lines.size() : n_threads;
    std::vector<std::thread> threads(n_threads);
    size_t interval = lines.size() / n_threads;
    for (size_t i = 0, start = 0; i < n_threads; ++i, start += interval){
        threads[i] = std::thread([&lines, start, finish(i < n_threads - 1 ? start + interval : lines.size())]{
            for (size_t i = start; i < finish; ++i){
                uint16_t count_of[101] = {};
                const uint16_t sz = count_each(std::move(lines[i]), count_of);
                size_t res = argL2(count_of, sz / 2);
                lines[i] = res == sizeof(count_of) / sizeof(count_of[0]) ? "None" : std::to_string(res);
            }
        });
    }
    for (std::thread& thr : threads)
        thr.join();
    for (const std::string& res : lines)
        std::cout << res << '\n';
}

int main(int argc, char* argv[])
{
    major(argv[1]);
}

▶Anonymous 10/25/17 (Wed) 02:59:17 No.810094>>810097 >>810100 >>810132 >>810133 >>810137

>>809592

python2.7 solution that's more similar to the C solutions yet still slow as fuck, but faster than the sqlite solution. 1.5s vs 1.8-2s with 8 processes, 40 x 50k lines (i know it's 30k but for similar timing to the sqlite solution tests).


#!/bin/python2.7
import argparse
from multiprocessing import Pool
import cStringIO

def dowork(sequence):
	input_stream=cStringIO.StringIO(sequence)
	count_list=[0]*50001

	num=''
	total_num=0
	next=input_stream.read(1)
	while (next!=''):
		if next!=',':
			num+=next
		else:
			int_num=int(num)
			count_list[int_num]+=1
			total_num+=1
			num=''
		next=input_stream.read(1)

	target_num=int(total_num)/2

	for index,count in enumerate(count_list):
		if count>=target_num:
			return index
	return None

def main():
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()
	try:
		sequences=open(args.filename,'r').read().rstrip().split('\n')
	except:
		print 'None'

	p=Pool(8)
	majors=p.map(dowork,sequences)
	for major in majors:
		print major

if __name__=="__main__":
	main()

▶Anonymous 10/25/17 (Wed) 02:59:58 No.810097

>>810094

change count_list[0]*50001 to

count_list[0]*30001

▶Anonymous 10/25/17 (Wed) 03:05:23 No.810100>>810102

>>810094

looking at it now this is a stupid way of doing it i should be counting numbers 0-100 not every one of the 30k integers.

▶Anonymous 10/25/17 (Wed) 03:07:46 No.810102

>>810100

which is exactly what its doing, fml, just change count_list[0]*50000 to count_list[0]*101, which is all it's using of that list anyway

performance now 1.3s compared to 1.5

▶Anonymous 10/25/17 (Wed) 04:45:03 No.810132>>810182 >>811220

>>810094

>ugh all this disgusting python

have some Serious Perl:

#! /usr/bin/env perl
use common::sense;

LINE: while (defined(my $line = <>)) {
  my %counts;
  my @numbers = $line =~ /(\d+)/g;
  for (@numbers) {
    if (++$counts{$_} > @numbers/2) {
      print $_, "\n";
      next LINE;
    }
  }
  print "None\n";
}

▶Anonymous 10/25/17 (Wed) 04:48:27 No.810133>>810221

improved version that only parses the data once.

>>810094

reads the data twice, the first time to split on newlines and then again to parse the sequences. this version does it all at once. It still reads the file into a string first before parsing, reading the file directly during parsing was 20% slower than using cstring.

it performs about the same single threaded and worse multi-threaded, because there's less to multithread.


#!/bin/python2.7
import argparse
from multiprocessing import Pool
import cStringIO

def parse(sequences):
	input_stream=cStringIO.StringIO(sequences)
	count_list_list=[]

	next=input_stream.read(1)
	while (next!=''):
		count_list=[0]*101
		total_num=0
		num=''
		while(next!='\n' and next!=''):
			if next!=',':
				num+=next
			else:
				int_num=int(num)
				count_list[int_num]+=1
				total_num+=1
				num=''
			next=input_stream.read(1)

		target_num=int(total_num)/2
		count_list_list.append((count_list,target_num))
		next=input_stream.read(1)

	return count_list_list
		
def findMajor( (count_list,target_num) ):
	for index,count in enumerate(count_list):
		if count>=target_num:
			return index
	return None
	
def main():
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()
	try:
		sequences=open(args.filename,'r').read()
	except:
		print 'None'

	count_list_list=parse(sequences)

	p=Pool(1)
	majors=p.map(findMajor,count_list_list)
	for major in majors:
		print major

if __name__=="__main__":
	main()

▶Anonymous 10/25/17 (Wed) 04:54:38 No.810137

>>810094

also

split('\n) on a string is about 100 times slower than

readlines() on a file, but it only happens once it's the difference between 10ms and .01ms.

▶Anonymous 10/25/17 (Wed) 06:47:34 No.810168

>>810087

>>810038

i think the first c++ solution is more readable/simpler than yours

▶Anonymous 10/25/17 (Wed) 06:50:57 No.810171

>>810068

Nice spoiler.

▶Anonymous 10/25/17 (Wed) 07:15:03 No.810181

Codegolf Perl ported to work on shit site

use List::Util(first);while(<>){chomp;@a=split/,/;%c={};$c{$_}++for@a;print(((first{$c{$_}>~~@a/2}@a)||None).$/)}

>>809928

This is the first time I've seen Control.Arrow used in actual code

▶Anonymous 10/25/17 (Wed) 07:15:33 No.810182

>>810067

>>810087

>>810132

Failed

▶Anonymous 10/25/17 (Wed) 07:18:51 No.810183>>810184 >>810189 >>810201 >>810347

Okay here is a chart of the current standings, including the rating given by the pajeet site.

(i am not sure if there is only one python2 anon here or if its more than one, please let me know if i missed someones program)


author      Language        Time, ms        Memory, bytes       Ranking Points
808872      c               3               253952              34.783
809731      c++             12              8760                34.972
810038      c++             2               706153              34.407
809861      go              933             9961472             25.055
809276      py3             195             5029888             30.461
810133      py2             446             5069896             29.989
810181      perl            128             5988301             29.779

▶Anonymous 10/25/17 (Wed) 07:27:51 No.810184

>>810183

Forgot to mention for this challenge the max score is 35 but the max is usually unobtainable

▶Anonymous 10/25/17 (Wed) 07:56:34 No.810187>>810191 >>810192

>>809703

best python solution


#!/bin/python2.7
import argparse
import subprocess
import zlib
import os

#thanks internet!
p_string='''x\x9c\x8dR\xcbn\x830\x10\xbc\xfb+\xb6\xc9!\x90\xa06\xf4\xd0\x03\xaf/@\xfc@\x8a"\x02v\xb2\n\x98\nL\xab6\xe2\xdf\xebG\xd2\x84@\xa4\xee\xc5\xf6\xeexvfm2G\x9e\x97]A!\xc0\xba\x15\r\xcd\xaa\x88\\sl\x9c\xcae\n\xf9>"d^P\x86\x9cB\x02\xee\xda%\x04\xb9\x80*Cn\xa9M\xd6\xecs\x07\xf2C\xd6,\x97\xea\xf0i\xc3\x89\x80\x8cNV\xb7,k\x85\xfb\xb6\x15\x90\xd7\x1d\x17\x9b$\xf5u\xad\x15\x85\xe7U\xb4j\xa9\xb0t\xc5\x81\xb5\x03-\xfe\xd0\x9a\x19\xa8\xed\x13\x8dD\x06\x96F\xe3Y! gX\xd2\x93j\xb5qS\xc7pIK\x9e\xb7C\x9e5\xdf\xfdE\xc0X\x04\x87\x10\xd6\xfe\x83b<,~\x1dd\x17\xb0L7\x88"\xe0\xb7\xbc*V+c\x8a\xa7\xfe]>\x1e&\x94\x07\xc3\xf3\xfcA\xe9\xd1\xb2!\x0ca\xe1,\xee\t5\xd6\xe0p\xcf\xeb\x86Z\xae=d\xea\x07\'Z\xb6t\x82"\x86\x97\x10^\xcf\xf3\xbb\x8d]]\x97\xc0\xa4\xe6BZe\x99\xbc\xed\x8f0\xacn\xe4\xc0\xe5C\xc8\x89\xa0\x9e\x88\\\x02H|i\x0b\xa7\x04_\x0c\x9aY`\nQ\x08\xf1#\xa0\n\xfd^\x12- \x08\x14w\x00\xb3w>\x1b+\xb9*2\x82E\xd3M\xe8\xfd\xf3&\xff\xc6q\xba\xdc\x8f\xb2\xe3\x8cr\xf0\xa4;=\x92>\x90=KjN\'U\xf7\xe3\xb1\xff\xeb\xb3\xdf_\xba\xfb\x8dC\xd5f\xd7\x93\x9e\xfc\x02\xddQ\xe8$'''


def main():
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()

	p=subprocess.Popen(['g++','-xc++','-O2','-'],stdout=subprocess.PIPE,
		stdin=subprocess.PIPE,stderr=subprocess.STDOUT,
		universal_newlines=True)
	p.communicate(zlib.decompress(p_string))

	try:
		print subprocess.check_output(os.getcwd()+'/a.out '+args.filename,shell=True) 
	except subprocess.CalledProcessError as result:
		print result.output	


if __name__=="__main__":
	main()

▶Anonymous 10/25/17 (Wed) 08:05:17 No.810189

>>810183

unless i missed something i think i'm the only one posting python2.7

▶Anonymous 10/25/17 (Wed) 08:15:58 No.810191

File (hide): 677bf1cd838a6fd⋯.jpg (22.28 KB, 282x302, 141:151, 172383475425.jpg) (h) (u)

>>810187

>#thanks internet!

that thread is pure gold

▶Anonymous 10/25/17 (Wed) 08:20:53 No.810192

>>810187

i made a binary version of this but the string is too long to post.

▶Anonymous 10/25/17 (Wed) 09:18:30 No.810201>>811220

>>810183

Golf Perl with an array instead of hashes:

use List::Util(first);@c=(0)x101;while(<>){chomp;@a=split/,/;$l=0;for(@a){$c[$_]++;$l++};print(((first{$c[$_]>~~$l/2}@a)||None).$/);$c[$_]=0 for 0..$#c}

Javashit:

var readline = require("readline");
var fs = require("fs");

var rl = readline.createInterface({
    input: fs.createReadStream(process.argv[2])
});

rl.on("line",  ln => {
    var cnt = new Array(101);
    cnt.fill(0);
    var l = 0;
    ln.split(',').forEach(num => {
        cnt[parseInt(num)] += 1;
        l += 1;
    });

    var half = (l / 2)|0;
    var res = cnt.findIndex(x => x > half);
    console.log(res < 0 ? "None" : res);
});

▶Anonymous 10/25/17 (Wed) 09:21:00 No.810203>>810208 >>810367

time to kill yourself, queen.

extern crate memchr;
extern crate memmap;

use std::thread::{self};
use memchr::{memchr};

const MAX_TESTS: usize = 40;

struct Results {
	array: [u8; MAX_TESTS],
	index: usize
}

impl Results {
	fn new() -> Self {
		Results {
			array: [0; MAX_TESTS],
			index: 0
		}
	}

	fn push(&mut self, result: Option<usize>) {
		self.array[self.index] = result.map_or(1, |r| r as u8 + 2);
		self.index += 1;
	}

	fn extend(&mut self, results: &Results) {
		self.array[self.index..][..results.index].copy_from_slice(&results.array[..results.index]);
		self.index += results.index;
	}
}

fn main() {
	let mut args = std::env::args_os().skip(1);
	assert_eq!(args.len(), 1);

	let mmap = memmap::Mmap::open_path(args.next().unwrap(), memmap::Protection::Read).unwrap();
	let slice = unsafe { mmap.as_slice() };

	// let results = work(slice);
	let results = split(slice, 2);

	for &r in &results.array[..results.index] {
		match r {
			0 => { unreachable!(); }
			1 => { println!("None"); }
			_ => { println!("{}", r - 2); }
		}
	}
}

fn split(slice: &[u8], depth: usize) -> Results {
	if slice.is_empty() {
		Results::new()
	} else if depth == 0 {
		work(slice)
	} else if let Some(i) = memchr(b'\n', &slice[slice.len() / 2..]) {
		let i = slice.len() / 2 + i;
		if i == slice.len() {
			work(slice)
		} else {
			let (sl, sr) = unsafe { std::mem::transmute::<(&[u8], &[u8]), (&'static [u8], &'static [u8])>(slice.split_at(i + 1)) };

			let tl = thread::spawn(move || split(sl, depth - 1));
			let tr = thread::spawn(move || split(sr, depth - 1));

			let mut results = tl.join().unwrap();
			results.extend(&tr.join().unwrap());
			results
		}
	} else {
		work(slice)
	}
}

fn work(slice: &[u8]) -> Results {
	fn linefeed(results: &mut Results, count: &mut u8, counts: &mut [u8; 101], n: usize) {
		counts[n] += 1;

		results.push(counts
			.iter()
			.cloned()
			.enumerate()
			.filter(|&(_, v)| v > (*count + 1) / 2)
			.next()
			.map(|(i, _)| i)
		);

		*count = 0;
		*counts = [0; 101];
	}

	let mut iter = slice.iter().cloned();
	let mut results = Results::new();
	let mut count = 0;
	let mut counts = [0; 101];

	while let Some(b) = iter.next() {
		let mut n = (b - b'0') as usize;

		match iter.next() {
			Some(b',') => { count += 1; counts[n] += 1; }
			Some(b'\n') | None => { linefeed(&mut results, &mut count, &mut counts, n); }
			Some(b) => {
				const LOOKUP: [usize; 10] = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90];
				n = LOOKUP[n] + (b - b'0') as usize;

				match iter.next() {
					Some(b',') => { count += 1; counts[n] += 1; }
					Some(b'\n') | None => { linefeed(&mut results, &mut count, &mut counts, n); }
					_ => {
						match iter.next() {
							Some(b',') => { count += 1; counts[100] += 1; }
							Some(b'\n') | None => { linefeed(&mut results, &mut count, &mut counts, 100); }
							_ => { unreachable!() }
						}
					}
				}
			}
		}
	}

	results
}

▶Anonymous 10/25/17 (Wed) 10:15:17 No.810208>>810209

>>810203

>impl Results {


gcc shit.c
shit.c:1:8: error: unknown type name ‘crate’
 extern crate memchr;
        ^~~~~
shit.c:1:14: warning: built-in function ‘memchr’ declared as non-function
 extern crate memchr;
              ^~~~~~
shit.c:2:8: error: unknown type name ‘crate’
 extern crate memmap;
...

▶Anonymous 10/25/17 (Wed) 10:22:23 No.810209>>810214

>>810208

>gcc

nigger, are you trying to compile rust code with gcc?

▶Anonymous 10/25/17 (Wed) 10:38:47 No.810214>>810215

>>810209

what is rust some kind of meme library for c++? maybe i should try g++

▶Anonymous 10/25/17 (Wed) 10:39:35 No.810215

>>810214

epic

▶Anonymous 10/25/17 (Wed) 11:17:46 No.810221>>810682 >>810729

>>810133

additional 30% speed improvement.

now with fastInt(tm) int lookup table instead of int cast

1.5-1.8s -> 1.1-1.25s single threaded, on my test data.


#!/bin/python2.7
import argparse
from multiprocessing import Pool
import cStringIO

int_lookup=None

def generateFastInt():
	global int_lookup

	int_lookup={}
	for i in range(0,101):
		int_lookup[str(i)]=i

def parse(sequences):
	global int_lookup
	input_stream=cStringIO.StringIO(sequences)
	count_list_list=[]

	next=input_stream.read(1)
	while (next!=''):
		count_list=[0]*101
		total_num=0
		num=''
		while(next!='\n' and next!=''):
			if next!=',':
				num+=next
			else:
				count_list[int_lookup[num]]+=1
				total_num+=1
				num=''
			next=input_stream.read(1)

		target_num=int(total_num)/2
		count_list_list.append((count_list,target_num))
		next=input_stream.read(1)

	return count_list_list
		
def findMajor( (count_list,target_num) ):
	for index,count in enumerate(count_list):
		if count>=target_num:
			return index
	return None
	
def main():
	generateFastInt()

	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()
	try:
		sequences=open(args.filename,'r').read()
	except:
		print 'None'

	count_list_list=parse(sequences)

	p=Pool(1)
	majors=p.map(findMajor,count_list_list)
	for major in majors:
		print major

if __name__=="__main__":
	main()

▶Anonymous 10/25/17 (Wed) 12:38:02 No.810239>>810241 >>810317 >>810929

Updated version

The new perl version performed worse than the old one so i left the old one in.


author      Language        Time, ms        Memory, bytes       Ranking Points out of 35
808872      c               3               253952              34.783
809731      c++             12              8760                34.972
810038      c++             2               706153              34.407
809861      go              933             9961472             25.055
809276      py3             195             5029888             30.461
810221      py2             340             5091500             30.156
810181      perl            128             5988301             29.779
810201      js              179             315392              34.424

▶Anonymous 10/25/17 (Wed) 12:40:58 No.810241>>810243 >>810359 >>810360

>>810239

>he didnt include my rust version which ironically is the fastest by far

lol

▶Anonymous 10/25/17 (Wed) 12:47:10 No.810243

>>810241

Queen paid me off so i would ignore Rust solutions.

▶Anonymous 10/25/17 (Wed) 13:53:18 No.810262>>810319 >>810448 >>810922 >>810929

>C-s lisp C-m

>nothing found

>8ch in 2017


(defun main (pathname)
  (labels ((main-loop (list result)
	     (cond ((null list) (nreverse result))
		   (T (main-loop (cdr list)
				 (cons (analyse-list (car list)) result))))))
    (main-loop (file-to-list pathname) nil)))

(defun analyse-list (list)
  "Give out the element that appear more than L/2 times."
  (labels ((analyse-list (list cnt)
	     (cond ((= cnt (length list)) (format t "~&None."))
		   ((> (num-appearance (nth cnt list) list)
		       (/ (length list) 2))
		    (format t "~&~S" (nth cnt list)))
		   (T (analyse-list list (1+ cnt))))))
    (analyse-list list 0)))

(defun num-appearance (number list &optional (appearances 0))
  "How much time a number appear in a given list"
  (cond ((null list) appearances)
	((= (car list) number)
	 (num-appearance number (cdr list) (1+ appearances)))
	(T (num-appearance number (cdr list) appearances))))

(defun file-to-list (pathname)
  "Make a list from a file"
  (with-open-file (s pathname)
    (labels ((file-to-list (pathname reslist)
	       (let ((line (read-line s nil :eof)))
		 (if (equalp line :eof)
		     (nreverse reslist)
		     (file-to-list pathname
				   (cons (line-to-list line) reslist))))))
      (file-to-list pathname nil))))

(defun line-to-list (string)
  "Make a list from a string formed as num,num,num with 0 < num < 100"
  (labels ((line-to-list (string length cnt-start cnt reslist)
	     (cond ((= length (- cnt 1)) (nreverse reslist))
		   ((or (= length cnt)
			(string= (aref string cnt) #\,))
		    (if (= (- cnt cnt-start) 2)
			(line-to-list string length (1+ cnt) (1+ cnt)
				      (cons (+ (* 10 (digit-char-p (aref string cnt-start)))
					       (digit-char-p (aref string (1+ cnt-start))))
					    reslist))
			(line-to-list string length (1+ cnt) (1+ cnt)
				      (cons (digit-char-p (aref string cnt-start))
					    reslist))))
		   (T (line-to-list string length cnt-start (1+ cnt) reslist)))))
    (line-to-list string (length string) 0 0 nil)))

Enjoy shit recursion faggot :^)

▶Anonymous 10/25/17 (Wed) 14:36:34 No.810275>>810729

File (hide): 151d29d8227ebd2⋯.png (151.29 KB, 1187x840, 1187:840, numpa.png) (h) (u)

trying to increase this python performance I found numba which i haven't heard before for some reason

https://numba.pydata.org/

it apparantly compiles python code in real time or something, I've been trying to get it working and was successful to some extent, but trying to deal with it is such a massive pain in the ass that I might as well just write the shit in C.

it's an interesting library anyway. it works well if your doing math in loops, but you have to play all kinds of games to get strings, or character arrays because strings aren't supported at all, doing anything but throwing errors.

▶Anonymous 10/25/17 (Wed) 16:45:54 No.810317>>810318 >>810332 >>810349 >>810381

>>810239

Try

use List::Util(first);@c=(0)x101;while(<>){chomp;@a=split/,/;$c[$_]++for(@a);$h=(~~@a)/2;print(((first{$c[$_]>$h}@a)||None).$/);$c[$_]=0 for 0..$#c}

and

use List::Util(first);@c=(0)x101;while(<>){chomp;@a=split/,/;$c[$_]++for(@a);$h=(~~@a)/2;print(((first{$c[$_]>$h}@a)||None).$/);@c=(0)x101}

▶Anonymous 10/25/17 (Wed) 16:56:32 No.810318

>>810317

dont you know what linebreaks are?

▶Anonymous 10/25/17 (Wed) 16:56:55 No.810319

File (hide): 7304e8b8c7f10ea⋯.gif (1.64 MB, 400x300, 4:3, 7304e8b8c7f10ea867cbcd58d7….gif) (h) (u)

>>810262

How do I into this level of lisp? It's always been a mystery to me.

t.imperative pleb

▶Anonymous 10/25/17 (Wed) 17:31:45 No.810332

>>810317

this code doesn't look like it respects my freedoms

▶Anonymous 10/25/17 (Wed) 18:08:59 No.810347>>810351 >>810358

>>810183

Oh now that is some fucking bullshit. So as usual, it comes down to what compiler they're using with what version on what platform. So here we go, micro-optimization time:


#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

#define MAX_L 30000
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t counts[101];

static inline uint_fast16_t parse_n(char **c) {
  uint_fast16_t retval = 0;
  char *begin = *c;
  char *end = begin;

  while(*end & 0x10) ++end;

  int len = end - begin;

  if(len > 2) retval = 100;
  else if(len <= 1) retval = begin[0] - '0';
  else retval = (begin[0] - '0') * 10 + begin[1] - '0';

  *c = end;
  return retval;
}

int main(int argc, char *argv[]) {
  struct stat stat;
  uint_fast16_t n;
  uint_fast16_t l;
  int i;

  int fd = open(argv[1], O_RDONLY);
  fstat(fd, &stat);
  buffer = (char *) malloc(stat.st_size);
  read(fd, buffer, stat.st_size);

  char *c = buffer;
  goto start;
  for(; *c; ++c) {
    (void) memset(counts, 0, sizeof(counts));
start:
    for(l = 1; ; ++l, ++c) {
      n = parse_n(&c);
      ++counts[n];
      if(*c == '\n') break;
    }
    uint_fast16_t half = l >> 1;
    for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) {
      if(counts[i] > half) {
        printf("%d\n", i);
        break;
      }
    }
    if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n");
  }
}

I've also left out the mmap() since they've broken/disabled it somehow, which is fucking garbage as that would only use 4k of memory while having next to no overhead. I'll make another post in a sec re getting buttfucked by gcc as it's kinda interesting.

▶Anonymous 10/25/17 (Wed) 18:12:14 No.810349

File (hide): dc63559330b2149⋯.jpg (27.21 KB, 600x425, 24:17, perl-scripts-done.jpg) (h) (u)

>>810317

▶Anonymous 10/25/17 (Wed) 18:15:20 No.810351>>810357 >>810381 >>810384

>>810347

I should have had coffee, first. Use this one instead, the other one potentially overruns. The rustfag will love that.


#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

#define MAX_L 30000
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t counts[101];

static inline uint_fast16_t parse_n(char **c) {
  uint_fast16_t retval = 0;
  char *begin = *c;
  char *end = begin;

  while(*end & 0x10) ++end;

  int len = end - begin;

  if(len > 2) retval = 100;
  else if(len <= 1) retval = begin[0] - '0';
  else retval = (begin[0] - '0') * 10 + begin[1] - '0';

  *c = end;
  return retval;
}

int main(int argc, char *argv[]) {
  struct stat stat;
  uint_fast16_t n;
  uint_fast16_t l;
  int i;

  int fd = open(argv[1], O_RDONLY);
  fstat(fd, &stat);
  buffer = (char *) malloc(stat.st_size + 1);
  read(fd, buffer, stat.st_size);
  buffer[stat.st_size] = 0;

  char *c = buffer;
  goto start;
  for(; *c; ++c) {
    (void) memset(counts, 0, sizeof(counts));
start:
    for(l = 1; ; ++l, ++c) {
      n = parse_n(&c);
      ++counts[n];
      if(*c == '\n') break;
    }
    uint_fast16_t half = l >> 1;
    for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) {
      if(counts[i] > half) {
        printf("%d\n", i);
        break;
      }
    }
    if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n");
  }
}

▶Anonymous 10/25/17 (Wed) 18:37:56 No.810357

>>810351

>overruns

queen btfo lmao

▶Anonymous 10/25/17 (Wed) 18:38:27 No.810358>>810365 >>810416

>>810347

So in my code above, you might notice the "while(*end & 0x10) ++end;". What it actually does doesn't matter (it's skipping over digits and stopping on ',' and '\n'), but how gcc and clang handle it does. Here's a test function with it:


char *foo(char *c) {
  while(*c & 0x10) ++c;
  return c;
}

Now the assembly I expect to get from this should be a pretty tight loop, but look at the differences between gcc and clang:

gcc:


foo:
.LFB0:
        .cfi_startproc
        testb   $16, (%rdi)
        movq    %rdi, %rax
        je      .L2
        .p2align 4,,10
        .p2align 3
.L3:
        addq    $1, %rax
        testb   $16, (%rax)
        jne     .L3
.L2:
        rep ret

clang:


foo:                                    # @foo
        .cfi_startproc
# BB#0:
        decq    %rdi
        .align  16, 0x90
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        testb   $16, 1(%rdi)
        leaq    1(%rdi), %rdi
        jne     .LBB0_1
# BB#2:
        movq    %rdi, %rax
        retq

clang completely neglects speculative execution and does a pre-decrement to handle the special case of the loop being length zero. Since that case never happens yet it pays the cost for it in every case it runs twice as slow as gcc. Nothing I tried got clang to not do stupid shit with that case, so if this runs on clang it will be a heavy penalty to the tightest loop. I'll probably look into it more tonight and see if the loop can be rewritten to exclude the case of a zero length value.

▶Anonymous 10/25/17 (Wed) 18:42:13 No.810359>>810374

>>810241

Rust + threads vs C is kinda bullshit since the test site isn't allowing C to use threads. I might try calling clone() via __syscall() later but I don't know if I want to put that amount of effort into it.

▶Anonymous 10/25/17 (Wed) 18:50:46 No.810360

File (hide): 7565519d0f55f3f⋯.png (506.09 KB, 789x397, 789:397, comparison.png) (h) (u)

>>810241

▶Anonymous 10/25/17 (Wed) 19:01:48 No.810365>>810369

>>810358

i don't know how trivial a task it would be, but you can obviously read the assembly, can't you just inline it in C, especially since it's that small?

▶Anonymous 10/25/17 (Wed) 19:04:50 No.810367>>810370 >>810380

>>810203

This one doesn't seem to work. Or I'm not compiling your faggy language correctly. The other rust one worked with the same method, though.


root@sid:~/src/foo# target/release/foo ../test.txt
0
1
0
1
0
0
0
0
0
1
0
0
0
0
0
0
0
0
23
1
1
1
0
0
0
0
0
0
0
3
0
27
62
0
0
3
0
2
0
2

▶Anonymous 10/25/17 (Wed) 19:09:45 No.810369

>>810365

Yeah, but inline asm really isn't a good idea. Modern optimization is about tricking the compiler into giving you the assembly you wanted, as then it's still as portable as C but as good as hand-written assembly. gcc/clang also have bultins for things you can't get out of C, like popcount, which are also portable (a C replacement is transparently used on architectures that don't have it). I almost never do real assembly these days, and only in the kernel.

Btw, as example of how bad compilers are, if you use __builtin_expect() to tell it the length 2 case is the most frequent, it actually compiles it worse.

▶Anonymous 10/25/17 (Wed) 19:11:26 No.810370>>810372 >>810376

>>810367

>doesnt seem to work

but according to your output it seems to work fine???

▶Anonymous 10/25/17 (Wed) 19:13:29 No.810372>>810375

>>810370

Those aren't the correct results.

▶Anonymous 10/25/17 (Wed) 19:14:07 No.810374

>>810359

there is actually a line commented out in the rust version. if you uncomment that and comment the next line, it will be single threaded

▶Anonymous 10/25/17 (Wed) 19:14:48 No.810375>>810377 >>810378

>>810372

then your test data is bad

▶Anonymous 10/25/17 (Wed) 19:14:52 No.810376

>>810370

I should add, it's for the python version that biases on 0. These are the expected results (there'll be a similar mix of 0/None for whatever you generate with it):


None
0
0
None
None
None
None
None
None
0
None
0
None
None
None
0
0
None
None
0
0
None
None
0
0
None
None
None
None
0
0
None
None
None
0
0
None
None
None
None

▶Anonymous 10/25/17 (Wed) 19:15:55 No.810377

>>810375

Doesn't work with my tests too. Time to rerewrite it in Rust Steve.

▶Anonymous 10/25/17 (Wed) 19:17:58 No.810378

>>810375

No fgt, use the test from >>809582 which generates a biased curve on 0 and compare output with everyone else.

root@sid:~/src# ./809582.py > lolrust.txt && foo/target/release/foo lolrust.txt

▶Anonymous 10/25/17 (Wed) 19:18:29 No.810380>>810382

>>810367

oh shit i think im retarded and my shit is overflowing. can you run it in debug mode and see if it panics?

▶Anonymous 10/25/17 (Wed) 19:18:31 No.810381>>810385 >>810387 >>810406

>>810317

first one didn't work, 2nd one did.

>>810351

nice runtime queen but the ranking is worse than your first entry so no update [1 371730 34.688]

latest and greatest


author      Language        Time, ms        Memory, bytes       Ranking Points out of 35
808872      c               3               253952              34.783
809731      c++             12              8760                34.972
810038      c++             2               706153              34.407
809861      go              933             9961472             25.055
809276      py3             195             5029888             30.461
810221      py2             340             5091500             30.156
810317      perl            93              5940749             29.880
810201      js              179             315392              34.424

▶Anonymous 10/25/17 (Wed) 19:19:32 No.810382>>810383 >>810384 >>810398

>>810380


root@sid:~/src# ./foo.py > lolrust.txt && foo/target/debug/foo lolrust.txt
thread '<unnamed>' panicked at 'attempt to add with overflow', src/main.rs:109:22
note: Run with `RUST_BACKTRACE=1` for a backtrace.
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:860:4
thread '<unnamed>' panicked at 'attempt to add with overflow', src/main.rsthread ':<unnamed>109' panicked at ':attempt to add with overflow22thread '',
<unnamed>src/main.rs' panicked at ':attempt to add with overflow102', :src/main.rs20:
102:20
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:860:4
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:860:4

▶Anonymous 10/25/17 (Wed) 19:20:00 No.810383

>>810382

FUGGGG :DDDDDDDDDDDDDDDDDDD

▶Anonymous 10/25/17 (Wed) 19:22:50 No.810384>>810388 >>810389

>>810382

i fixed it but ultimately doesnt matter. your autism >>810351 >>810358 is faster and i dont care to waste more time on this retarded challenge.

btw your multithreaded version is still shit.

▶Anonymous 10/25/17 (Wed) 19:23:20 No.810385

>>810381

>but the ranking is worse than your first entry

Such a great site. 1ms in 371k of memory and I lose. I'll game it harder and reduce the ram.

▶Anonymous 10/25/17 (Wed) 19:28:07 No.810387>>810399

>>810381

well atleast python isn't last.

and the python 3 version is using it's own builtin counter function to do all the work.

maybe performance could be sped up in the python3 version using the same int lookup dict the python2 version does instead of the int cast in sequences()

▶Anonymous 10/25/17 (Wed) 19:28:12 No.810388>>810390

>>810384

I'd recommend debugging why the multithreaded one runs poorly on your system as the result should be interesting. The only thing that could cause that (since otherwise it's basically running the same code per thread) is an unexpectedly slow CPU thread that delays everything else. It's why I told you to see if disabling affinity changes the result, since the scheduler will move it to another core. Usually this is either some background process eating a core that you didn't know about, or a laptop doing powersaving.

▶Anonymous 10/25/17 (Wed) 19:31:55 No.810389>>810394

File (hide): 0620a95d1c2cd9a⋯.jpg (32.48 KB, 334x321, 334:321, 1482416619661.jpg) (h) (u)

>>810384

>retarded challenge.

>being this mad

▶Anonymous 10/25/17 (Wed) 19:32:38 No.810390>>810395

>>810388 (Heil Hitler!)

>I'd recommend wasting even more time just so another anon can get internet points

meh. im going to write codeeval to add rust support instead.

▶Anonymous 10/25/17 (Wed) 19:34:07 No.810394>>810397

>>810389

actually im not mad. im just done. without going full unsafe and basically writing c in rust i wont beat queens solution

▶Anonymous 10/25/17 (Wed) 19:34:16 No.810395

>>810390

They need to link in pthreads, too. Although threaded solutions will lose on that test as their testcase is too small (it's apparently 10 times smaller than the testcase data we're using and you can't really go faster than 1ms).

▶Anonymous 10/25/17 (Wed) 19:36:30 No.810397

>>810394

RUST CONFIRMED FOR MEMELANG

▶Anonymous 10/25/17 (Wed) 19:37:36 No.810398>>810401

>>810382

>panicked

>We went to lunch afterward, and I remarked to Dennis that easily half the code I was writing in Multics was error recovery code. He said, "We left all that stuff out. If there's an error, we have this routine called panic, and when it is called, the machine crashes, and you holler down the hall, 'Hey, reboot it.'"

▶Anonymous 10/25/17 (Wed) 19:40:06 No.810399>>810402

>>809276

>>810387

using a lookup dict instead of int cast is definitely a 10-15% performance increase on the python3 version


#!/usr/bin/env python3

import sys
from collections import Counter

int_lookup={}

def genfastint():
	global int_lookup
	for i in range(0,101):
		int_lookup[str(i)]=i 

def fastint(str_int):
	global int_lookup
	return int_lookup[str_int]

def sequences(filename):
	with open(filename, "r") as f:
		for line in f:
			yield [fastint(i) for i in line.rstrip().split(",")]

def major(sequence):
	if not sequence:
		return None

	element, count = Counter(sequence).most_common()[0]

	if count <= (len(sequence) / 2):
		return None

	return element


def main():
	genfastint()
	for sequence in sequences(sys.argv[1]):
		print(major(sequence))

if __name__ == "__main__":
    main()

▶Anonymous 10/25/17 (Wed) 19:42:06 No.810401

>>810398

Honestly, although I removed the error handling from what I posted, my real code mostly has error handling that just calls abort(). If it's a case that should never happen, there's no reason to not only pay the cost of handling it, but add another codepath that needs tested.

▶Anonymous 10/25/17 (Wed) 19:42:22 No.810402

>>810399

that might be enough to beat pajeetscript

▶Anonymous 10/25/17 (Wed) 19:51:53 No.810406>>810439 >>810683

>>810381

Actually, before I go to the effort to do something fancy with the buffer, can you see what it reports for memory on this:


#include <stdint.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

#define MAX_L 30000
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t counts[101];

static inline uint_fast16_t parse_n(char **c) {
  uint_fast16_t retval = 0;
  char *begin = *c;
  char *end = begin;

  while(*end & 0x10) ++end;

  int len = end - begin;

  if(len > 2) retval = 100;
  else if(len <= 1) retval = begin[0] - '0';
  else retval = (begin[0] - '0') * 10 + begin[1] - '0';

  *c = end;
  return retval;
}

int main(int argc, char *argv[]) {
  struct stat stat;
  uint_fast16_t n;
  uint_fast16_t l;
  int i;

  int fd = open(argv[1], O_RDONLY);
  fstat(fd, &stat);
  buffer = mmap(NULL, stat.st_size, PROT_READ, MAP_SHARED, fd, 0);

  char *end = buffer + stat.st_size;
  char *c = buffer;
  goto start;
  for(; c < end; ++c) {
    (void) memset(counts, 0, sizeof(counts));
start:
    for(l = 1; ; ++l, ++c) {
      n = parse_n(&c);
      ++counts[n];
      if(*c == '\n') break;
    }
    uint_fast16_t half = l >> 1;
    for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) {
      if(counts[i] > half) {
        printf("%d\n", i);
        break;
      }
    }
    if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n");
  }
}

▶Anonymous 10/25/17 (Wed) 19:53:36 No.810410>>810439

>tfw no ruby solution


File.readlines(ARGV[0]).each do |line|
        most = line.split(",").group_by{|i| i }.sort_by{ |k,v| v.length}.reverse.first[1]
        if most.length >= line.split(",").length.to_i / 2
                puts most.first
        else
                puts "None"
        end
end

▶Anonymous 10/25/17 (Wed) 20:08:59 No.810416>>810425 >>810515

>>810358

I really hate it when the compiler insists on doing it wrong. I couldn't fix the loop with clang but I've managed to get clang to beat gcc (both with -O2) by rewriting it and assuming we can read some bytes past the 0 (which the compiler cannot take for granted). Timings:


8bits (original)
gcc:    37371347 ns
clang: 46252361 ns
32bits
gcc:    28575093 ns
clang: 14820225 ns
64bits
gcc:    25180917 ns (better than 32)
clang: 18220982 ns (worse than 32)


char *foo(char *c){
  int64_t *d = (int64_t*)c;
  while(1){
    switch(*d&0x1010101010101010){
      case 0x1010101010101010: ++d; break;
      case 0x0010101010101010: return ((char*)d)+7;
      case 0x0000101010101010: return ((char*)d)+6;
      case 0x0000001010101010: return ((char*)d)+5;
      case 0x0000000010101010: return ((char*)d)+4;
      case 0x0000000000101010: return ((char*)d)+3;
      case 0x0000000000001010: return ((char*)d)+2;
      case 0x0000000000000010: return ((char*)d)+1;
    }
  }
  return d;
}


char *foo(char *c){
  int *d = (int*)c;
  while(1){
    switch(*d&0x10101010){
      case 0x10101010: ++d; break;
      case 0x00101010: return ((char*)d)+3;
      case 0x00001010: return ((char*)d)+2;
      case 0x00000010: return ((char*)d)+1;
      default:         return d;
    }
  }
  return d;
}

Asm (int version only):

gcc:


foo:
.LFB0:
        .cfi_startproc
        movl    (%rdi), %edx
        andl    $269488144, %edx
        cmpl    $4112, %edx
        je      .L3
.L13:
        jle     .L12
        cmpl    $1052688, %edx
        je      .L6
        cmpl    $269488144, %edx
        jne     .L9
        addq    $4, %rdi
        movl    (%rdi), %edx
        andl    $269488144, %edx
        cmpl    $4112, %edx
        jne     .L13
.L3:
        leaq    2(%rdi), %rax
        ret
        .p2align 4,,10
        .p2align 3
.L6:
        leaq    3(%rdi), %rax
        ret
        .p2align 4,,10
        .p2align 3
.L12:
        cmpl    $16, %edx
        leaq    1(%rdi), %rax
        jne     .L9
        rep ret
        .p2align 4,,10
        .p2align 3
.L9:
        movq    %rdi, %rax
        ret

clang:


foo:                                    # @foo
        .cfi_startproc
# BB#0:
        movl    $269488144, %eax        # imm = 0x10101010
        jmp     .LBB0_1
        .p2align        4, 0x90
.LBB0_7:                                #   in Loop: Header=BB0_1 Depth=1
        addq    $4, %rdi
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movl    (%rdi), %ecx
        andl    %eax, %ecx
        cmpl    $269488143, %ecx        # imm = 0x1010100F
        jle     .LBB0_2
# BB#6:                                 #   in Loop: Header=BB0_1 Depth=1
        cmpl    $269488144, %ecx        # imm = 0x10101010
        je      .LBB0_7
        jmp     .LBB0_10
.LBB0_2:
        cmpl    $16, %ecx
        je      .LBB0_9
# BB#3:
        cmpl    $4112, %ecx             # imm = 0x1010
        je      .LBB0_8
# BB#4:
        cmpl    $1052688, %ecx          # imm = 0x101010
        jne     .LBB0_10
# BB#5:
        addq    $3, %rdi
        movq    %rdi, %rax
        retq
.LBB0_9:
        incq    %rdi
.LBB0_10:
        movq    %rdi, %rax
        retq
.LBB0_8:
        addq    $2, %rdi
        movq    %rdi, %rax
        retq

▶Anonymous 10/25/17 (Wed) 20:27:32 No.810425

>>810416

Nice! It also points to this being a problem solvable in the same way strlen() does it with vector operations to remove the conditionals (it has several versions of assembly code that test for 0 in multiple bytes simultaneously).

▶Anonymous 10/25/17 (Wed) 20:58:35 No.810439>>810441 >>810448

>>810406

its done my queen

>>810410

probably the shortest solution i think

Reordered by score


author      Language        Time, ms        Memory, bytes       Ranking Points out of 35
809731      c++             12              8760                34.972
810406      c               1               32768               34.971
810038      c++             2               706153              34.407
810201      js              179             315392              34.424
810410      ruby            221             794624              33.950
809276      py3             195             5029888             30.461
810221      py2             340             5091500             30.156
810317      perl            93              5940749             29.880
809861      go              933             9961472             25.055

▶Anonymous 10/25/17 (Wed) 21:06:50 No.810441>>810447

>>810439

>12 times slower and gets the highest score

Fucking hell. It might be impossible to get 1ms under 8k of ram. Not sure how much headroom there is on that. I'll try it.

▶Anonymous 10/25/17 (Wed) 21:15:24 No.810447>>810451

>>810441

did you consider just accepting c++ as the superior language?

▶Anonymous 10/25/17 (Wed) 21:17:08 No.810448>>810458 >>810463

>>810439

Where is the common lisp code here >>810262

What a fucking faggot.

>/tech/ in 2017

▶Anonymous 10/25/17 (Wed) 21:19:15 No.810451>>810453

>>810447

Technically I already have. I do most of my work in C++. But I never touch slow and broken shit like string, streambuf, or fstream.

▶Anonymous 10/25/17 (Wed) 21:21:48 No.810453>>810456

>>810451

what are your suggested alternatives?

▶Anonymous 10/25/17 (Wed) 21:24:29 No.810456>>810457

>>810453

C strings, C functions. It's all still valid C++ and they aren't full of mistakes (other than for wide char functions).

▶Anonymous 10/25/17 (Wed) 21:25:47 No.810457>>810464

>>810456

Are you suggesting to do all IO in C++ via C-style file functions?

▶Anonymous 10/25/17 (Wed) 21:27:14 No.810458

>>810448

>>809878

l2read faggot

▶Anonymous 10/25/17 (Wed) 21:32:37 No.810463

>>810448

Scheme but no Common lisp?

Suck my dick.

Even this piece of shit of guile is here.

▶Anonymous 10/25/17 (Wed) 21:33:05 No.810464>>810468

>>810457

Yes. You have to anyway for anything non-trivial as there are no C++ bindings to most things.

▶Anonymous 10/25/17 (Wed) 21:39:52 No.810468>>810470

File (hide): 3c7b60ec3fd4f0c⋯.png (161.88 KB, 603x452, 603:452, 3c7b60ec3fd4f0cee79530ec0d….png) (h) (u)

>>810464

>as there are no C++ bindings to most things.

explain further...

▶Anonymous 10/25/17 (Wed) 21:47:29 No.810470>>810474

>>810468

Try to write some networking code in C++, for example. There's nothing standard (at least that I know of, but I currently only care about C++14). So you're either adapting C++ to C yourself, or using a library that does. You'll quickly find that libraries for this only go so far. As soon as you start doing something that isn't pleb-tier there are no bindings. Even when there are bindings, you're screwed as you can't pass data as a std::string without copying it. This quickly results in massive amounts of dynamic memory flying around and lots of little ones in remote corners of the code as you do getsockopts and ioctls or whatever that need string data passed around. You could wait for std::string_view, but as I mentioned in another thread, that's still a pretty shitty solution.

So, the simple solution to all this nonsense: just use C-style strings. C++ strings were a mistake and can be ignored.

▶Anonymous 10/25/17 (Wed) 21:51:27 No.810471>>810472 >>810473

File (hide): 451c00839744fb6⋯.jpg (55.02 KB, 900x810, 10:9, smug.jpg) (h) (u)

Boost master race

▶Anonymous 10/25/17 (Wed) 21:51:45 No.810472>>810502

File (hide): 54e3874df2e56df⋯.png (596.83 KB, 740x740, 1:1, 54e3874df2e56df74efc136b97….png) (h) (u)

>>810471

>boost

▶Anonymous 10/25/17 (Wed) 21:52:42 No.810473>>810502

>>810471

boost is pure cancer. If you want to kill a C++ project, use it liberally.

▶Anonymous 10/25/17 (Wed) 21:52:50 No.810474>>810476

>>810470

>C-style strings

exploits ahoi!

▶Anonymous 10/25/17 (Wed) 21:59:32 No.810476

>>810474

C++ strings don't really do anything for you re exploit prevention. They're actually a bit worse as in many (most?) implementations the actual buffer is on the heap even when declared on the stack. At least on stack there is the possibility of stack smashing protection blocking an exploit.

▶Anonymous 10/25/17 (Wed) 22:07:43 No.810480>>810491

Yo queen, did you go to college to get your first job?

▶Anonymous 10/25/17 (Wed) 22:15:31 No.810485>>810489 >>810682 >>811055

#!/usr/bin/env python3
from sys import argv
from collections import Counter
for line in open(argv[1]):
    nums = [int(x) for x in line.split(',')]
    [(major, count)] = Counter(nums).most_common(1)
    print(major if count > len(nums) / 2 else None)

▶Anonymous 10/25/17 (Wed) 22:24:56 No.810489>>810490

>>810485

is this a new entry or an updated one by an old contestant?

▶Anonymous 10/25/17 (Wed) 22:32:59 No.810490

>>810489

A new one. I went in blind and wrote it before looking at the other entries. It's not optimized for anything, just the most straightforward way I could think of. Might be useful as some kind of baseline.

▶Anonymous 10/25/17 (Wed) 22:33:05 No.810491

>>810480

I went to college (MS in CS, UC system), but I've never been asked for my resume. I'd get offers through random things I'd be doing and I'd take interesting ones. Recently, I've been working both networking and gamedev. I have no background at all in gamedev, so it's been an interesting ride.

▶Anonymous 10/25/17 (Wed) 22:53:17 No.810496>>810499 >>810932

I tried but I suck at coding.

I didn't really try to optimize it (don't know how really), but there's definitely room for improvement.

But I think my algorithms a bit different. It should work for an unspecified bucket size as long as it's an int. There's probably some way to make the algorithm work without the second pass through, but I'm not sure how.

#include <stdio.h>
#include <stdlib.h>
#define MAXLEN 30000

int major_element(int *, int);

int main(int argc, char *argv[]){
    FILE *fp;
    char c;
    char snum[6];
    char *p = snum;

    int numbers[MAXLEN];
    int len = 0;

    int major;

    if(argc < 2){
        printf("Need an argument\n");
        return 0;
    }
    fp = fopen(argv[1], "r"); 
    if(fp == NULL){
        printf("couldn't open file\n");
        return 0;
    }
    while((c = fgetc(fp)) != EOF){
        if('0' <= c && c <= '9'){
            *p++ = c;
        }
        else if('c' == ' '){
        }
        else if(c == ','){
            *p++ = '\0';
            p = snum;
            numbers[len++] = atoi(snum);
        }
        else if(c == '\n'){
            *p++ = '\0';
            p = snum;
            numbers[len++] = atoi(snum);
            major = major_element(numbers, len);
            if(major == 0){
                printf("None\n");
            }
            else{
                printf("%d\n", major);
            }
            len = 0;
        }
        else{
            return 0;
        }
    }
    
    fclose(fp);
    return 0;
}

int major_element(int *list, int len){
    int count = 0;
    int major = 0;
    int occ = 0;
    int *listcp = list;

    for(int i = 0; i < len; i++){
        if(listcp[i] == major){
            count++;
        }
        else if(count > 0){
            count--;
        }
        else{
            count = 1;
            major = listcp[i];
        }
    }
    for(int i = 0; i < len; i++){
        if(list[i] == major){
            occ++;
        }
    }
    if(occ > len/2){
        return major;
    }
    return 0;
}

This probably shouldn't have been written in C if all I cared about was not using a bucket

▶Anonymous 10/25/17 (Wed) 22:57:46 No.810499

>>810496

already see something I should have done.

Should have initialized "char snum = ['\0', '\0', '\0', '\0', '\0', '\0'].

▶Anonymous 10/25/17 (Wed) 23:08:45 No.810502>>810509 >>810671

File (hide): de55162a3f1f7e8⋯.jpg (75.43 KB, 1280x720, 16:9, b43.jpg) (h) (u)

>>810473

>>810472

zero arguments anytime

Use of high-quality libraries like Boost speeds initial development, results in fewer bugs, reduces reinvention-of-the-wheel, and cuts long-term maintenance costs.

"The obvious solution for most programmers is to use a library that provides an elegant and efficient platform independent to needed services. Examples are BOOST..."

--- Bjarne Stroustrup, Abstraction, libraries, and efficiency in C++

▶Anonymous 10/25/17 (Wed) 23:21:31 No.810509>>810514

>>810502

>smug anime girl

Every time.

I used boost for 10 years. I just recently finished removing all of it from all our code. Go have fun making mistakes.

▶Anonymous 10/25/17 (Wed) 23:34:33 No.810514

File (hide): 307fd157bdea1b7⋯.png (911.72 KB, 1110x1080, 37:36, 307.png) (h) (u)

>>810509

>boost was good enough to use for 10 years

>reinvent the wheel and rewrite it all yourself

>this will definitely be better than boost

Go have fun downgrading your code.

▶Anonymous 10/25/17 (Wed) 23:43:06 No.810515>>810518 >>810667 >>810683 >>810798

>>810416

So I'm halfway through a workday and I've done no work so I should probably stop now, but I'm going to leave you with this. This is a modification to use something similar to your technique that you might find interesting. First, the code:


#include <stdint.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

#define MAX_L 30000
#define MAX_CASES 40

static char *buffer;
static uint_fast16_t counts[101];

static inline uint_fast16_t parse_n(char **c) {
  uint_fast16_t retval;
  char *begin = *c;

  uint32_t v = *((uint32_t *) *c);

  v &= 0x00101010;
  if(v == 0x00101010) { *c += 3; return 100; }
  else if(v == 0x00100010) { *c += 1; return retval = begin[0] - '0'; }
  else {*c += 2; return (begin[0] - '0') * 10 + begin[1] - '0'; }
}

int main(int argc, char *argv[]) {
  struct stat stat;
  uint_fast16_t n;
  uint_fast16_t l;
  int i;

  int fd = open(argv[1], O_RDONLY);
  fstat(fd, &stat);
  buffer = malloc(stat.st_size + 3);
  read(fd, buffer, stat.st_size);

  // Magic.
  buffer[stat.st_size] = '\\';
  // Prevent valgrind from being scared of my huge feminine dick.
  buffer[stat.st_size + 1] = '\0';
  buffer[stat.st_size + 2] = '\0';

  char *end = buffer + stat.st_size;
  char *c = buffer;
  goto start;
  for(; c < end; ++c) {
    (void) memset(counts, 0, sizeof(counts));
start:
    for(l = 1; ; ++l, ++c) {
      n = parse_n(&c);
      ++counts[n];
      if(*c == '\n') break;
    }
    uint_fast16_t half = l >> 1;
    for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) {
      if(counts[i] > half) {
        printf("%d\n", i);
        break;
      }
    }
    if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n");
  }
}

This is disgustingly fast, if the compiler gets it right.

>user 0m0.000s

The parse_n gets really interesting here (and probably needs an ifdef for endianness, but I didn't bother), and the magic in the buffer, too. Basically, parse_n is now reading 4 bytes at a time and checking for a bit that matches digits, but there's a couple difficult cases here. First, these are the possible strings as seen by parse_n and if the bit would be set per character (second column). 'X' is a wildcard:


length 1:
9,9X   101X
9\n9X  101X
9\n\0X 100X

length 2:
99\nX  110X
99,X   110X

length 3:
999X   111X

So first, we can just AND off 'X' and ignore it, so we're really only considering 3 characters. Note that there are two possible values for length 1; '101' and '100', where the terminating null we'd usually place at the end of the string gives us some trouble. Now we could write an if / else if / else statement where length 1 is handled in the else so we don't have to do two comparisons for it, but the problem with this is the non-jumping portion of the resulting assembly will be the fastest due to how modern x86 processors work. And length 1 is a very rare case, roughly 10% of the numbers. We really want length 2 to be handled by the else, but then we have to do two comparisons for length 1.

The solution is to not null terminate the string. We can terminate it with something that has the 0x10 bit set instead. Then, when we read past the end of the string when considering a last digit that is length 1, it thinks it is still the '101' case. That lets us change the code to have the else statement handle the length 2 case without introducing another comparison.

The other two extra bytes in the buffer don't matter but I clear to make valgrind not complain. In real code you shouldn't do that as it wastes time just to make a tool happy, you should use the VALGRIND_MAKE_MEM_DEFINED macro, but that requires feature tests (yay autotools) so I didn't include it.

▶Anonymous 10/25/17 (Wed) 23:50:10 No.810518

>>810515

Btw, it's not worth testing this one since it uses more ram to ensure it can do that termination trick on the buffer, just thought it'd be interesting to show you.

▶Anonymous 10/26/17 (Thu) 03:02:51 No.810581

>v&

▶Anonymous 10/26/17 (Thu) 08:08:33 No.810667>>810669

>>810515

>uint32_t v = *((uint32_t *) *c);

uuuuuuh excuse me? is this allowed? dont you have to make sure that it is aligned and shit? also wont you get different results depending on endianness?

please explain

▶Anonymous 10/26/17 (Thu) 08:19:05 No.810669

>>810667

As I mentioned, I'm blowing off endianness and issues with meme hardware in that version so it stays simple. A production version would ifdef it. Just thought that anon would be interested in where he could take his AND-based version.

▶Anonymous 10/26/17 (Thu) 08:25:38 No.810671>>810694

>>810502

>Bjarne Stroustrup

who the fuck is that even? some no name shitter shilling for boost

▶Anonymous 10/26/17 (Thu) 09:06:05 No.810682>>810683

>>810485

i've ran out of ideas trying to do anything with

>>810221

the only thing i can think of is just use the collections library which would make it identical to the python3 version. the answer is just don't use python, or concentrate on multi-thread performance over single-threaded.

if this was a real life problem the data would probably already be in some kind of sql database and the sql solution would just be done

▶Anonymous 10/26/17 (Thu) 09:07:53 No.810683>>810690

>>810515

>>810406

>>810682

would these C solutions scale to data that exceeds them systems ram i wonder?

▶Anonymous 10/26/17 (Thu) 09:24:20 No.810690

>>810683

If you wanted to remove the problem's limit on the number of test cases or the length of a test, the mmap() version would scale as the backing is managed by the OS and responds to memory pressure (unless you mprotect it, which is almost always a bad idea). If you also wanted to remove the "a filename" restriction (e.g., to feed it a potentially gargantuan amount of data larger than your filesystem can handle), at the price of some performance, you could convert it to a dual-buffer setup with two bytes of overlap (ringbuffers would be slower). That was my plan before I decided I needed to do actual work today.

▶Anonymous 10/26/17 (Thu) 09:42:43 No.810694

>>810671

pajeet detected. we don't tolerate your kind around here

▶Anonymous 10/26/17 (Thu) 12:13:28 No.810729>>810738 >>810742 >>810743

>>810221

>>810275

300% python2 boost

i wanted to get some experience with numba so i rewrote this solution for it, which isn't really the same solution at all now. i doubt the pajeet site will run it anyway, numba isn't part of the standard library, you can get it with pip though.

this solution uses numpy and numba (numba uses numpy datatypes). on my test data the original was 1.8s, this version is 0.55s. I don't know why it's 1.8s now instead of 1.1s, i must have fucked with the generator i was using. either way, it's a 300% performance increase. I took out all the multiprocessing, but it would benefit from it with n cores/sequences. it reads the data twice, first to split on then on lines and then each individually. each line could be sent to an individual process.

I thought this numba thing was going to be a meme but it really is a shit ton faster and not really that complicated to get going once you get a hang for what data types it accepts and what it's limits are. i'm sure i'll use it in the future.


#!/bin/python2.7
import argparse
import numba
import numpy as np

@numba.jit("uint16[:](uint8[:])",locals={ 'integer':numba.uint8,'nparray_count':numba.uint16[:] })
def countNumbers(nparray_seq):
	#define return array at top for high performance loop, nopython=True doesn't
	#support returning arrays but will compile the loop using it's fast dataypes
	#if the array is declared beforehand. 
	nparray_count=np.zeros((1,102),dtype=np.uint16)[0]

	for integer in nparray_seq:
		nparray_count[integer]+=1

	return nparray_count

@numba.jit("uint16(uint16[:])",nopython=True,locals={'total_num':numba.uint16} )
def findTotalNum(nparray_count):
	total_num=0
	for x in range(102):
		total_num+=nparray_count[x]

	return total_num

@numba.jit("uint8(uint16[:])",nopython=True, locals={'target_num':numba.uint16,'major':numba.uint8,
	'index':numba.uint8,'count':numba.uint16 } )
def findMajor(nparray_count):
	target_num=findTotalNum(nparray_count)/2
	major=102 #no None->numba.uint8, use 102
	
	for index,count in enumerate(nparray_count):
		if count>=target_num:
			major=index

	return major
	
def main():
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()
	try:
		sequences=open(args.filename,'r').readlines()
	except:
		print 'None'

	for seq in sequences:
		nparray_seq=np.fromstring(seq,dtype=np.uint8,count=-1,sep=',')
		nparray_count=countNumbers(nparray_seq)
		major=findMajor(nparray_count)
		if major==102: #no None for numba.uint8
			print 'None'
		else:
			print major

if __name__=="__main__":
	main()

▶Anonymous 10/26/17 (Thu) 12:19:07 No.810731

basically numba doesn't handle a lot of shit, but it handles the majority of what you would be doing inside of a loop anyway, just not any string manipulation, it has to be done on numbers.

just import jit and putting

@numba.jit

infront of every function definition is notthe ticket to performance. if you do that it'll try to predict what's going on and speed things up, but it'll fall back into it's pyobject mode which is slower than straight python if you don't write code specifically with numba in mind.

▶Anonymous 10/26/17 (Thu) 12:20:32 No.810732

>>810730

i tried to delete this immediately to replace "is the ticket to performance" to " is not the ticket to performance" but now its bitching about wrong password

▶Anonymous 10/26/17 (Thu) 12:32:57 No.810738

>>810729

also it's not really useful if the loop isn't significant, or if the loop isn't getting called more than once because there's numba compile overhead.

▶Anonymous 10/26/17 (Thu) 12:54:12 No.810742

>>810729

enabling caching takes it down to 0.35-0.4s. caching compiles once to a file on first run, and if you run it again it'll read that instead of compiling and remove the overhead. measuring timing inside python it only reports a total time of 0.04s, not 0.35s with "time python2.7 ...". I don't know where the extra 0.3s is coming from, i guess it's io or something.

if python's internal timing is right it's only 40ms on my machine with worst case testdata, which should put it above all the other scripting languages. it's kind of cheating though, this library is pretty much using it's own jit instead of python's.

▶Anonymous 10/26/17 (Thu) 12:58:59 No.810743>>810755 >>811087

>>810729

with caching enabled, gil disabled on nopython numba code, and multiprocessing, and i added a break in findMajor() I forgot to do.


#!/bin/python2.7
import argparse
import numba
import numpy as np
from multiprocessing import Pool
import time

@numba.jit("uint16[:](uint8[:])",cache=True,locals={ 'integer':numba.uint8,'nparray_count':numba.uint16[:] })
def countNumbers(nparray_seq):
	#define return array at top for high performance loop, nopython=True doesn't
	#support returning arrays but will compile the loop using it's fast dataypes
	#if the array is declared beforehand. 
	nparray_count=np.zeros((1,102),dtype=np.uint16)[0]

	for integer in nparray_seq:
		nparray_count[integer]+=1

	return nparray_count

@numba.jit("uint16(uint16[:])",nopython=True,nogil=True,cache=True, locals={'total_num':numba.uint16} )
def findTotalNum(nparray_count):
	total_num=0
	for x in range(102):
		total_num+=nparray_count[x]

	return total_num

@numba.jit("uint8(uint16[:])",nopython=True,nogil=True,cache=True, locals={'target_num':numba.uint16,'major':numba.uint8,
	'index':numba.uint8,'count':numba.uint16 } )
def findMajor(nparray_count):
	target_num=findTotalNum(nparray_count)/2
	major=102 #no None->numba.uint8, use 102
	
	for index,count in enumerate(nparray_count):
		if count>=target_num:
			major=index
			break

	return major

def worker(seq):
	nparray_seq=np.fromstring(seq,dtype=np.uint8,count=-1,sep=',')
	nparray_count=countNumbers(nparray_seq)
	major=findMajor(nparray_count)
	
	if major==102: #no None for numba.uint8
		return 'None'
	else:
		return major
	
def main():
	total_time=time.time()
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()

	try:
		sequences=open(args.filename,'r').readlines()
	except:
		print 'None'

	p=Pool(1)
	results=p.map(worker,sequences)

	for result in results:
		print result

if __name__=="__main__":
	main()

▶Anonymous 10/26/17 (Thu) 14:21:07 No.810755>>810758 >>810762 >>810809

>>810743

thats a lot of messy code and imports considering its still not as fast as the c/c++ solutions.

was python a mistake?

▶Anonymous 10/26/17 (Thu) 14:34:44 No.810758

>>810755

if your going to spend the time to optimize everything for numba you might as well write the fucking thing in C and have no restrictions with all the speed. I guess if you have something big in python and only need to speed up a couple of loops it's worth it.

▶Anonymous 10/26/17 (Thu) 14:46:55 No.810762>>810773 >>810777

>>810755

>was python a mistake?

It has its own uses. You wouldn't want to write anything dealing with large amounts of strings, i.e. a web application/CGI in C/C++. It's useful when there's no performance requirement. It's good for quick prototyping. But it's pretty slow on my Android phone, to the point where the environment takes .2s to boot.

▶Anonymous 10/26/17 (Thu) 14:48:00 No.810763>>810764 >>810881

Updates


author      Language        Time, ms        Memory, bytes       Ranking Points out of 35
809731      c++             12              8760                34.972
810406      c               1               32768               34.971
810496      c               17              568                 34.970
810038      c++             2               706153              34.407
810201      js              179             315392              34.424
810410      ruby            221             794624              33.950
810485      py3             196             5017600             30.470
809276      py3             195             5029888             30.461
810221      py2             340             5091500             30.156
810317      perl            93              5940749             29.880
809861      go              933             9961472             25.055

▶Anonymous 10/26/17 (Thu) 14:53:01 No.810764

>>810763

>go is the worst

LOL

▶Anonymous 10/26/17 (Thu) 14:56:25 No.810765>>810767

im sure rust would be last place if it ever managed to run.

afaik the solution posted in this thread crashed or something :^)

▶Anonymous 10/26/17 (Thu) 14:58:42 No.810767

>>810765

rustfag rage quit

▶Anonymous 10/26/17 (Thu) 15:35:58 No.810773>>810796

>>810762

this is the biggest performance impact on the numba code now. i simplified and compressed the code a bit but it's not noteworthy enough to post.


def main():
	t_start=time.time()
	total_time=time.time()
	parser=argparse.ArgumentParser()
	parser.add_argument('filename',type=str)
	args=parser.parse_args()

	try:
		sequences=open(args.filename,'r').readlines()
	except:
		print 'None'
		quit()

	p=Pool(1)
	results=p.map(worker,sequences)

	for result in results:
		print result
	print 'total time:'+str(time.time()-t_start)


time python2.7 challenge_new.py testdata
...
total time:0.0445680618286

real	0m0.406s
user	0m0.348s
sys	0m0.296s

this 350ms difference has to be python's initialization time.

▶Anonymous 10/26/17 (Thu) 15:57:15 No.810777>>810779

>>810762

>It has its own uses. You wouldn't want to write anything dealing with large amounts of strings, i.e. a web application/CGI in C/C++.

Why not?

▶Anonymous 10/26/17 (Thu) 16:05:39 No.810779

>>810777

this challenge is dealing with large amounts of strings, 40, 100k+ character long strings

▶Anonymous 10/26/17 (Thu) 17:22:15 No.810796>>810813

>>810773

it's the imports


def main():
	print "python is shit"

if __name__=="__main__":
	main()

time python2.7 shit.py

real 0m0.024s

user 0m0.020s

sys 0m0.004s

add


import argparse
import numba
import numpy as np
from multiprocessing import Pool

real 0m0.289s

user 0m0.232s

sys 0m0.196s

▶Anonymous 10/26/17 (Thu) 17:25:54 No.810797>>810803 >>810932

File (hide): 35a2f9cf57aa962⋯.jpg (323.08 KB, 1200x1200, 1:1, 35a2f9cf57aa9621f13695dc4c….jpg) (h) (u)

just learn c or c++ already python anon

▶Anonymous 10/26/17 (Thu) 17:25:57 No.810798>>810846

>>810515

>Now we could write an if / else if / else statement where length 1 is handled in the else so we don't have to do two comparisons for it, but the problem with this is the non-jumping portion of the resulting assembly will be the fastest due to how modern x86 processors work.

Doesn't branch prediction deal with this?

▶Anonymous 10/26/17 (Thu) 17:38:26 No.810799>>810856 >>810913

Now it's my turn!!!!

I am a pedophile, does that give me special points?

I did not look at any of the solutions in this thread, written from scratch.

If I win, I want a little girl as a prize.


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

uint8_t lookup [58] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
uint8_t lookup2 [58] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 };

uint16_t counts[101];

int main(int argc, char *argv[])
{
    FILE* fp = fopen(argv[1], "r");
    int c;
    int position = 0;
    char buffer[3];
    uint8_t cur_num = 0;
    uint16_t length = 0;
    uint16_t target_length;

    memset(counts,0,101*sizeof(unsigned short));

    while(1)
    {
        c = fgetc(fp);

        if((c == ',') || (c == '\n') || (c == EOF))
        {
            if(position == 2)
                cur_num = lookup2[buffer[0]] + lookup[buffer[1]];
            else if(position == 1)
                cur_num = lookup[buffer[0]];
            else if(position == 3)
                cur_num = 100;

            counts[cur_num]++;
            position = 0;
            length++;
        }
        else
        {
            buffer[position] = (char)c;
            position++;
        }

        if((c == '\n') || (c == EOF))
        {
            target_length = length/2;

            int i;
            for(i=0; i<=100; i++)
            {
                if(counts[i] > target_length)
                {
                    printf("%d\n", i);
                    break;
                }

                if(i==100)
                    printf("None\n");
            }

            memset(counts,0,101*sizeof(unsigned short));
            length = 0;
        }

        if(c == EOF)
            break;
    }

    fclose(fp);

    return 0;
}

The lookup and lookup2 tables could be removed and calculation used instead, that would save small number of bytes in memory.

Those shit tables was to be optimization, but maybe they slow down rather than speed up? Adding and multiplying by 10 should be cheap process, but is reading from table a cheap process?

▶Anonymous 10/26/17 (Thu) 17:52:40 No.810803>>810805

>>810797

that's the only solution, but i didn't realize how slow those imports were.

if your going to do a website in django, with all that shit it's going to import and whatever else your importing, it doesn't matter how fast your python code is, the response time isn't going to be any faster than those imports, which could be 300m or worse by themselves.

▶Anonymous 10/26/17 (Thu) 17:54:55 No.810805>>810849

>>810803

i've done django before on some things, and as far as I know it's a fresh launch on that python every single time, so those imports are going to run and lag every single request.

▶Anonymous 10/26/17 (Thu) 18:11:03 No.810809

>>810755

It's much easier to write working things in Python than in C/C++. There's a tradeoff between running time and development time, and Python is often the most sensible choice. Especially if you're the only user, or if the bottleneck is going to be outside your code either way (in the network, for example).

If you're writing a large application then it's possible that only 1% of your code is performance-sensitive. Optimizing that could be better than using a faster, harder language for the entire application. Writing only the performance-sensitive part in a faster language is possible too, but it really depends on the situation which approach is best. Writing fast code in a slow language can come in handy.

Not all of this is based on personal experience so take it with a grain of salt

▶Anonymous 10/26/17 (Thu) 18:17:33 No.810813>>810822

>>810796

to be fair to python i benchmarked each individual import.

argparse:0.00708198547363

numba:0.185921907425

numpy:1.90734863281e-06

multiprocessing:6.91413879395e-06

argparse of all things took the longest out of the standard library, but none of them are as terrible as numba itself, which takes 185ms.

▶Anonymous 10/26/17 (Thu) 18:36:27 No.810822>>810826

>>810813

>argparse of all things took the longest out of the standard library

It's because multiprocessing is lazy with importing.

The two standard library modules here are argparse and multiprocessing.

$ python3 -m timeit -s 'import multiprocessing, importlib' 'importlib.reload(multiprocessing)'
10000 loops, best of 3: 180 usec per loop
$ python3 -m timeit -s 'import argparse, importlib' 'importlib.reload(argparse)'
1000 loops, best of 3: 940 usec per loop

argparse is a simple module. It's just a single file, argparse.py, 89K and almost entirely function and class definitions. When you import argparse that file gets evaluated. Simple enough.

multiprocessing is a more full-featured module. It's a directory with a init.py and a number of other modules, including pool.py. init.py adds modules from the context module (context.py). It takes them from a BaseContext object, which has a lot of method definitions, one of which is this:

def Pool(self, processes=None, initializer=None, initargs=(),
         maxtasksperchild=None):
    '''Returns a process pool object'''
    from .pool import Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
                context=self.get_context())

When you only import and use Pool from multiprocessing, only the scaffolding submodules and the actual pool.py submodule get evaluated. But the pool.py module doesn't even get evaluated until you create a Pool object for the first time (every time after that the import is cached). Notice what happens if we force the import of pool.py:

$ python3 -m timeit -s 'from multiprocessing import pool; import importlib' 'importlib.reload(pool)'
1000 loops, best of 3: 412 usec per loop

Or re-importing the scaffolding as well:

$ python3 -m timeit -s 'from multiprocessing import pool; import multiprocessing, importlib' 'importlib.reload(multiprocessing); importlib.reload(pool)'
1000 loops, best of 3: 612 usec per loop

Importing Pool from multiprocessing is still faster than importing argparse (as you'd expect, because pool.py is just 26K), but it's not that much faster. multiprocessing does well in your benchmark because it only imports what's necessary and delays it until you use it.

▶Anonymous 10/26/17 (Thu) 18:43:58 No.810826

>>810822

thanks for the clarification, i was wondering what the fuck was going on there with argparse taking way longer than like multiprocessing.

▶Anonymous 10/26/17 (Thu) 19:21:29 No.810846

>>810798

A linear path is usually faster regardless of prediction. It really depends on the processor as this changes dramatically every time Intel touches anything and there are a lot of moving parts. Example of dramatic change: unaligned access is essentially free on i7.

▶Anonymous 10/26/17 (Thu) 19:27:17 No.810849

>>810805

Python websites are usually webapps because of how absurdly slow it is to start. With django, you'd normally host it as WSGI hosted in an appserver like uwsgi talking to nginx. Then the interpreter only loads once and then again every X requests to fix Python bloating up like a dead fish over time. While you could run it as CGI, it would be unusable for anything more than debugging.

▶Anonymous 10/26/17 (Thu) 19:41:49 No.810856>>810894

>>810799

This uses the exact same amount of memory as that earlier solution by another anon. I guess the memory detect script on the pajeet site broke.

▶Anonymous 10/26/17 (Thu) 20:27:20 No.810881>>810884 >>810892

>>810763

So basically, everything other than C and C++ is a meme language.

▶Anonymous 10/26/17 (Thu) 20:30:01 No.810884

>>810881

Your not wrong

▶Anonymous 10/26/17 (Thu) 20:31:13 No.810885>>810891

Where is Java solution???

▶Anonymous 10/26/17 (Thu) 20:35:05 No.810891>>810893

>>810885

It's still starting up.

▶Anonymous 10/26/17 (Thu) 20:39:40 No.810892

>>810881

at maximum autism this is the fastest /tech/ can make an interpreted language go on a simple problem.

▶Anonymous 10/26/17 (Thu) 20:40:17 No.810893

File (hide): 7edf1d74b7bb552⋯.jpg (8 KB, 265x265, 1:1, 1462308840929.jpg) (h) (u)

>>810891

my sides were not prepared for that

▶Anonymous 10/26/17 (Thu) 20:41:35 No.810894>>810895

>real 0m0.406s

>user 0m0.348s

>sys 0m0.296s

How do you people get that? What should I click to get this numbers?

>>810856

>This uses the exact same amount of memory as that earlier solution by another anon. I guess the memory detect script on the pajeet site broke.

so you upload and tested it?

show results and post updated result list

▶Anonymous 10/26/17 (Thu) 20:42:26 No.810895>>810898 >>810913

>>810894

"time", in the shell. Alternatively, /usr/bin/time is a different tool that shows some useful stuff in --verbose, but has shit resolution. perf is also useful.

▶Anonymous 10/26/17 (Thu) 20:46:36 No.810898>>810899 >>810900

>>810895

time also gives different output if you have nice infront of it i discovered.


time python2.7 challenge_new.py testdata
real	0m0.412s
user	0m0.336s
sys	0m0.176s

nice -20 time python2.7 challenge_new.py testdata
total time:0.0473449230194
0.36user 0.29system 0:00.44elapsed 149%CPU (0avgtext+0avgdata 74708maxresident)k
0inputs+32outputs (0major+21093minor)pagefaults 0swaps

i don't know how or why that is, time has a format option but I don't see any reference to it in nice

▶Anonymous 10/26/17 (Thu) 20:47:09 No.810899

>>810898

ignore "total time" that's coming from python i should have deleted that line

▶Anonymous 10/26/17 (Thu) 20:48:39 No.810900>>810907 >>810916

>>810898

>time also gives different output if you have nice infront of it i discovered.

That's because the first command uses the shell built-in, and the second command uses /usr/bin/time by resolving $PATH.

nice can't access shell built-ins because it's not part of your shell. Try running "env time". It'll give the same kind of output as "nice -20 time".

Running "time nice" instead of "nice time" should let you use the built-in.

▶Anonymous 10/26/17 (Thu) 20:53:06 No.810904>>810908

how can you see the memory that a process used?

▶Anonymous 10/26/17 (Thu) 20:53:28 No.810907

>>810900

Can also do something like nice bash -c "time ..."

▶Anonymous 10/26/17 (Thu) 20:55:02 No.810908>>810909

>>810904

That's actually a difficult question due to the definition of "memory that a process used". I'd recommend /usr/bin/time --verbose's resident set size, but that's still a pretty squishy number.

▶Anonymous 10/26/17 (Thu) 20:57:07 No.810909>>810911

>>810908

can you just do


nice time --verbose ./a.out

its easier to type than the usr/bin thing

▶Anonymous 10/26/17 (Thu) 20:58:20 No.810911>>810926

>>810909

You can, but use "env" instead of "nice". It's the usual tool used for that purpose and it doesn't have side effects.

▶Anonymous 10/26/17 (Thu) 21:00:13 No.810913>>810925

>>810895

>"time", in the shell. Alternatively, /usr/bin/time is a different tool that shows some useful stuff in --verbose, but has shit resolution. perf is also useful.

what do you mean by shell? windows explorer or cmd?

when I clicked "time" in cmd it shows time and asks me to set new time

can someone test my code >>810799 to show real user sys time?

▶Anonymous 10/26/17 (Thu) 21:02:27 No.810916>>810920

>>810900

wait so time run straight from bash is bash's own version of time, it's not hitting /usr/bin/time, or whatever is in my path. that's some bullshit.

▶Anonymous 10/26/17 (Thu) 21:03:44 No.810919

man bash


RESERVED WORDS
       Reserved words are words that have a special meaning to the shell.  The following words are recognized as reserved when unquoted and either the
       first word of a simple command (see SHELL GRAMMAR below) or the third word of a case or for command:

       ! case  coproc  do done elif else esac fi for function if in select then until while { } time [[ ]]

one of these should change the name

▶Anonymous 10/26/17 (Thu) 21:03:48 No.810920

>>810916

Welcome to Unix.

▶Anonymous 10/26/17 (Thu) 21:05:19 No.810922>>810927 >>811415

>>810262

How do you test this shit? The level of autism to just get lisp implementations to compile something is ridiculous. I used sbcl (this is the 'fast' one, right?), did (load "whatever") then (main "test.txt") (from the python test case generator) and it took 84 seconds. Is that expected or is there some sort of "don't go horribly fucking slow" flag I missed? I also tried compiling it to an executable like,

>(sb-ext:save-lisp-and-die "lispshit" :toplevel #'main :executable t)

but the result wouldn't take a command line argument as far as I could tell.

Lisp was a mistake.

▶Anonymous 10/26/17 (Thu) 21:06:36 No.810925

>>810913

you're baiting aren't you?


real    0m0.017s
user    0m0.000s
sys     0m0.000s

▶Anonymous 10/26/17 (Thu) 21:07:14 No.810926>>810963

>>810911

I'd not recommend using env for anything. That program was a mistake.

▶Anonymous 10/26/17 (Thu) 21:08:07 No.810927>>810929 >>811416

>>810922

Also, the lisp version in addition to taking 84 seconds required 99 megs of memory.

▶Anonymous 10/26/17 (Thu) 21:09:16 No.810929>>810931

>>810927

>>810262

>>810239

i want to see this on the list

▶Anonymous 10/26/17 (Thu) 21:11:14 No.810931

>>810929

I don't know if the site OP uses supports it. I think they were arguing above about that. But I wanted to see some numbers for it so I did it myself.

▶Anonymous 10/26/17 (Thu) 21:11:16 No.810932>>810938 >>811113

>>810797

lol >>810496 doesn't work. So congrats on your fucking testing abilities pajeet.

▶Anonymous 10/26/17 (Thu) 21:12:56 No.810938

File (hide): 6022f5cd49df0a1⋯.jpg (12.64 KB, 225x225, 1:1, 1725834573237.jpg) (h) (u)

>>810932

thanks

▶Anonymous 10/26/17 (Thu) 21:31:46 No.810963>>810978

>>810926

I think env is a horrible hack, but what would you use instead?

"#!/usr/bin/env interpreter" is awful, but so is anything that doesn't actually work.

▶Anonymous 10/26/17 (Thu) 21:41:48 No.810978>>810991

>>810963

Give the standardized path to the program. Part of LSB autism was to remove need for env hacks.

▶Anonymous 10/26/17 (Thu) 21:50:28 No.810991>>810994

>>810978

That's nice in theory, but LSB only covers Linux, and most distros have given up the pretense of following it.

It's a gain in compatibility, a loss in purity, but no loss in functionality or functional correctness. It's better than not doing it.

▶Anonymous 10/26/17 (Thu) 21:51:31 No.810994

>>810991

I've not had to use env in probably 15 years. I'd re-evaluate what you're doing if you find it necessary.

▶Anonymous 10/26/17 (Thu) 21:59:40 No.811008>>811023 >>811028

>>808650 (OP)

>>808655

>>808660

>"Coding challenges for the world's best developers."

>Posting baby-tier problems that you should be able to solve in your first week of programming.

>Has to look for other people to solve it for him.

>Bonus for rust.

Do your own homework, faggot.

▶Anonymous 10/26/17 (Thu) 22:08:46 No.811023

>>811008

>no fun allowed

>lets all larp about programming but never do any actual programming

▶Anonymous 10/26/17 (Thu) 22:11:58 No.811028

>>811008

Even a babby-tier problem can be autistically optimized as a challenge. You afraid to compete, anon?

▶Anonymous 10/26/17 (Thu) 22:24:23 No.811037

i've been thinking about writing this in bash but i don't think i have the motivation for that

▶Anonymous 10/26/17 (Thu) 22:51:47 No.811055>>811057 >>811064

>>810485

I made this almost three times as fast by keeping the numbers as strings instead of converting them:

from sys import argv
from collections import Counter
for line in open(argv[1]):
    nums = line.strip().split(',')
    [(major, count)] = Counter(nums).most_common(1)
    print(major if count > len(nums) / 2 else None)

I also checked if reading the file as plain bytes instead of unicode strings made it faster, but that made it a bit slower for some reason.

▶Anonymous 10/26/17 (Thu) 22:54:12 No.811057>>811061 >>811062

>>811055

Python3 likely converts it to unicode when passed to counter. How does that run on python2?

▶Anonymous 10/26/17 (Thu) 23:00:30 No.811061

>>811057

just checked, runs fine on 2.7 with zero modifications.

▶Anonymous 10/26/17 (Thu) 23:01:31 No.811062>>811071 >>811072

>>811057

Counter is a dict subclass. It takes anything hashable, not just strings, and Python is strongly typed, so no unicode conversion. The implementation is simple enough that I'm experimenting with just doing the counting manually.

I tried Python 2, and to my surprise, it was more than two times as slow. Using unicode in Python 2 doesn't help.

▶Anonymous 10/26/17 (Thu) 23:01:57 No.811064

>>811055

counter probably doesn't care, it'll count any data type in any list

▶Anonymous 10/26/17 (Thu) 23:08:24 No.811071>>811087

>>811062

>The implementation is simple enough that I'm experimenting with just doing the counting manually.

I just discovered that I was looking at a pure Python fallback, and CPython runs a C implementation of the critical function, so this is not going to happen.

All the heavy lifting is done through things implemented in C at this point so I don't think it'll get much faster without witchcraft.

▶Anonymous 10/26/17 (Thu) 23:09:11 No.811072>>811086

>>811062

>I tried Python 2, and to my surprise, it was more than two times as slow.

Weird - python2 is usually faster. Maybe you've discovered the one case that justifies the existence of python3.

▶Anonymous 10/26/17 (Thu) 23:17:58 No.811086

>>811072

I think it's because Python 2's Counter doesn't have the bottleneck written in C. Specifically, it's these two lines:

for elem in iterable:
    self[elem] = self_get(elem, 0) + 1

In Python 3 they're moved into a separate function that's replaced by a function of the same name imported from _collections if possible.

Python 3 is a much better language than Python 2, but performance improvements are usually because of things that could be backported to Python 2.

▶Anonymous 10/26/17 (Thu) 23:18:34 No.811087>>811091

>>811071

>witchcraft

>>810743

this will do it in 40ms if i can somehow reduce numba's 200ms import time

▶Anonymous 10/26/17 (Thu) 23:22:44 No.811089

>>809485

That guy isn't me. The 99 thing was a joke about the interval notation. Only pretending to be retarded, etc..

▶Anonymous 10/26/17 (Thu) 23:23:17 No.811091

>>811087

you can seperate the numba shit and compile it to a module that can be imported without numba installed, i've thought about compiling it seperatly, compressing it to a string, including it in the source, and then having it decompress the string into a python tempfile, which is hopefully in ram, and importing the module from there, but i'd bet it's going to take longer than 200ms and not be worth the effort

▶Anonymous 10/26/17 (Thu) 23:34:19 No.811099>>811100 >>811103

>>809866

>>809861

>933

wtf?


real	0m0.001s
user	0m0.000s
sys	0m0.002s

Where does 933 come from?

▶Anonymous 10/26/17 (Thu) 23:36:01 No.811100>>811101

>>811099

What happens if you use go run?

▶Anonymous 10/26/17 (Thu) 23:36:57 No.811101

>>811100

real	0m0.132s
user	0m0.140s
sys	0m0.025s

▶Anonymous 10/26/17 (Thu) 23:40:09 No.811103>>811108

>>811099

How do you build/run go so it goes fast? I could try running it here.

▶Anonymous 10/26/17 (Thu) 23:44:19 No.811108>>811109

>>811103

Not sure what you mean. I have go version go1.9.1 linux/amd64. Then time the go build/go run.

▶Anonymous 10/26/17 (Thu) 23:46:46 No.811109>>811111

>>811108

Running it here, it doesn't produce the same output. Seems to give up after 5 tests.


root@sid:~/src# ./809861 test.txt
None
0
0
None
None

This is the test data I'm using:

https://mega.nz/#!VPoAWRwQ!MNyWOflyP3HbfCrcCn0WRp008dQJPyixKgBAUphC240

▶Anonymous 10/26/17 (Thu) 23:49:01 No.811111>>811112

>>811109

$ time ./challenge test.txt
None
0
0
None
None

real	0m0.014s
user	0m0.014s
sys	0m0.001s

▶Anonymous 10/26/17 (Thu) 23:50:33 No.811112>>811115

>>811111 (checked)

Yeah, but there's 40 lines in that file, not 5.

▶Anonymous 10/26/17 (Thu) 23:51:43 No.811113

>>810932

I thought all the numbers had to be positive. Whoops.

Here's it is now. I hope it works this time...

#include <stdio.h>
#include <stdlib.h>
#define MAXLEN 30000

int major_element(int *, int);

int main(int argc, char *argv[]){
    FILE *fp;
    char c;
    char snum[6];
    char *p = snum;

    int numbers[MAXLEN];
    int len = 0;

    int major;

    if(argc < 2){
        printf("Need an argument\n");
        return 0;
    }
    fp = fopen(argv[1], "r"); 
    if(fp == NULL){
        printf("couldn't open file\n");
        return 0;
    }
    while((c = fgetc(fp)) != EOF){
        if('0' <= c && c <= '9'){
            *p++ = c;
        }
        else if(c == ','){
            *p = '\0';
            numbers[len++] = atoi(snum);
            p = snum;
        }
        else if(c == '\n'){
            *p = '\0';
            p = snum;
            numbers[len] = atoi(snum);
            major = major_element(numbers, len);
            if(major == -1){
                printf("None\n");
            }
            else{
                printf("%d\n", major);
            }
            len = 0;
        }
        else{
            return 0;
        }
    }
    
    fclose(fp);
    return 0;
}

int major_element(int *list, int len){
    int count = 0;
    int major = 0;
    int occ = 0;
    int *listcp = list;

    for(int i = 0; i < len; i++){
        if(listcp[i] == major){
            count++;
        }
        else if(count > 0){
            count--;
        }
        else{
            count = 1;
            major = listcp[i];
        }
    }
    for(int i = 0; i < len; i++){
        if(list[i] == major){
            occ++;
        }
    }
    if(occ > len/2){
        return major;
    }
    return -1;
}

▶Anonymous 10/26/17 (Thu) 23:52:45 No.811115>>811117

>>811112

Those are the numbers from your file. It is still 14ms, nowhere near 933. Where that number comes from?

Sorry I don't understand. pls no bully

▶Anonymous 10/26/17 (Thu) 23:56:25 No.811117>>811120 >>811125 >>811145

>>811115

What I mean is, the test.txt file has 40 lines, 40 tests. The output of your program only shows 5 tests. It's skipping a huge amount of the test data for some reason.

I cut your program down to something that should just print line by line (I think, as I don't know go):


package main

import (
        "bufio"
        "fmt"
        "os"
)

func process(filename string) {
        f, err := os.Open(filename)
        if err != nil {
                panic(err)
        }
        defer f.Close()
        s := bufio.NewScanner(f)
        for s.Scan() {
                fmt.Println("None")
        }
}

func main() {
        if len(os.Args) != 2 {
                fmt.Printf("usage: %s [input]\n", os.Args[0])
                return
        }
        process(os.Args[1])
}

But it still only prints 5 lines. Can you see what's up? is "for s.Scan()" doing the right thing?

▶Anonymous 10/26/17 (Thu) 23:58:30 No.811119

imo this might end up being a bug in go.

▶Anonymous 10/26/17 (Thu) 23:58:31 No.811120>>811134

>>811117

It is not my program. I didn't check the code, I just copied the source code and tried the test. I don't program in Go. I will try to fix it.

▶Anonymous 10/27/17 (Fri) 00:06:03 No.811125>>811126 >>811129 >>811135

>>811117

Ahaha, yeah, it's a bug/"feature" in go. Great job, faggots at Google.

If you add,


        if err := s.Err(); err != nil {
                log.Fatal(err)
        }

The silent error the scanner is generating (what the fuck kind of systems language silently truncates a file?) will get logged:


2017/10/26 17:01:01 bufio.Scanner: token too long
exit status 1

The textbook 'iterate over lines of a file' go example will fail on long lines. These aren't even that long. What a fucking disaster. Rather than fix it, they documented it,


Scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the buffer. When a scan stops, the reader may have advanced arbitrarily far past the last token. Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.

Why would you leave garbage like that in a language? It's just a landmine for novice users to create security holes with.

▶Anonymous 10/27/17 (Fri) 00:07:55 No.811126>>811222

>>811125

Isn't failing mysteriously on long lines an ancient Unix tradititon?

▶Anonymous 10/27/17 (Fri) 00:10:16 No.811129>>811133 >>811135

File (hide): 7766c8b05191806⋯.png (99.78 KB, 651x706, 651:706, goatlang.png) (h) (u)

>>811125

The golang community, everyone:

https://twitter.com/davecheney/status/604837853344989184?lang=en

▶Anonymous 10/27/17 (Fri) 00:17:37 No.811133

File (hide): 9a6a5b5e888d71d⋯.png (66.68 KB, 646x332, 323:166, readline_only_for_experts.png) (h) (u)

>>811129

fucking lol

▶Anonymous 10/27/17 (Fri) 00:18:25 No.811134

>>811120

don't even waste your time, fuck Go

▶Anonymous 10/27/17 (Fri) 00:19:11 No.811135>>811138 >>811140

>>811129

>>811125

> It's just a landmine for novice users to create security holes with.

It is well documentated and it works like advertised. What is the problem?

▶Anonymous 10/27/17 (Fri) 00:23:58 No.811138>>811141

>>811135

https://golang.org/pkg/bufio/#Scanner

this reads as being literally no different than a readlines function but with arbitrary undocumented limits. is there some kind of performance benefit?

▶Anonymous 10/27/17 (Fri) 00:25:44 No.811140>>811180

>>811135

gets() was also well documented and worked as advertised.

Look at what happened here - all over twitter and stackoverflow are people recommending using Scanner to iterate over lines even though it will fuck up (silently!) as soon as it's fed a long line, for some unknown definition of long. As a result, anon wound up learning the wrong way to do things and produced broken code. Something that broken should not be part of a language as what happened here can happen.

▶Anonymous 10/27/17 (Fri) 00:27:32 No.811141

>>811138

I think the main reason is using Reader is a pain in the ass compared to Scanner. I was just trying it and even converting bytes[] to a string is a pain in the ass.

▶Anonymous 10/27/17 (Fri) 00:29:57 No.811142>>811147

File (hide): ad58bf9ab16201b⋯.png (83.43 KB, 664x465, 664:465, thumbs_up_for_abstraction.png) (h) (u)

you people just need to learn the correct level of abstraction is

▶Anonymous 10/27/17 (Fri) 00:33:47 No.811145>>811146

>>811117

New hopefully correct time now.

$ time ./challenge test.txt 
None
0
0
None
None
None
None
None
None
0
None
0
None
None
None
0
0
None
None
0
0
None
None
0
0
None
None
None
None
0
0
None
None
None
0
0
None
None
None
None

real	0m0.041s
user	0m0.041s
sys	0m0.003s

▶Anonymous 10/27/17 (Fri) 00:35:43 No.811146>>811149

>>811145

Just increased buffer size to the anons code.

s := bufio.NewScanner(f)
const maxCapacity = 512*1024
buf := make([]byte, maxCapacity)
s.Buffer(buf, maxCapacity)

▶Anonymous 10/27/17 (Fri) 00:36:12 No.811147

>>811142

Jesus what a terrible community. I guess they all came from webdev so it figures.

▶Anonymous 10/27/17 (Fri) 00:38:05 No.811149>>811151

>>811146

How hard is it to change it to accept any line length?

▶Anonymous 10/27/17 (Fri) 00:41:33 No.811151

>>811149

maybe use the readlines thing, get the max lengths of the lines, and then use scanner on it again so you can get the correct level of abstraction

▶Anonymous 10/27/17 (Fri) 00:42:50 No.811154>>811163

File (hide): 31b667ef087e3da⋯.png (127.69 KB, 764x1076, 191:269, lol.png) (h) (u)

Almost everything when searching for how to iterate over multiple lines in go in a safe/general way is wrong. And as a bonus, everyone's leaving out error handling (why is this even optional in 3 CYE?). What a trainwreck.

▶Anonymous 10/27/17 (Fri) 01:03:24 No.811163

>>811154

>In each loop we print the line returned by the scan.

>we

these faggots resort to shilling their shitty language in stackoverflow

▶Anonymous 10/27/17 (Fri) 01:58:51 No.811180>>811192

>>811140

>will fuck up (silently!)

It doesn't fail silently. You have to check for errors (Err). This is working as intended.

>b-but I trusted a Pajeet's code on stackoverflow intead of reading docs and it didn't have error checking

If you are too retarded to error check yourself then use Java.

▶Anonymous 10/27/17 (Fri) 02:02:16 No.811181

This code is GPL do not steal. My code. Warning my code must give code to use. By reading this you have agreed to give me your code. Thanks you kindly sirs.


#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  if(argc == 1) goto retard;
  FILE *fp = fopen(argv[1], "r");
  if(fp == NULL) goto retard;

  int buf_ind = 0, len = 0;
  int counts[101] = {0};
  char c, buf[5] = {0}, *t;
  while((c = fgetc(fp)) != EOF){
    buf[buf_ind++] = c;
    if(c == ',' || c == '\n'){
      int val = strtol(buf, &t, 10);
      if(val == 0 && t == buf) goto retard;
      buf_ind = 0;
      counts[val] ++;
      len++;
      if(c == '\n'){
        len >>= 1;
        int res = -1;
        for(int i=0; i<101; i++){
          if(counts[i] > len) res = i;
          counts[i] = 0;
        }
        len = 0;
        (res < 0) ? printf("None\n") : printf("%i\n",res);
      }
    }
  }
  return 0;
 retard:
  printf("Retard\n");
  return -1;
}

Free software. Give me code. Thank you sirs.

▶Anonymous 10/27/17 (Fri) 02:30:19 No.811192

>>811180

There is no excuse for optional error checking in a modern language.

▶Anonymous 10/27/17 (Fri) 04:03:54 No.811220>>811227

Code I would actually accept in a sysadmin context (performance not that important; reviewability very important; lots of small scripts so something of this scale would actually show up):

>>809276

>>809324 (though a binary for any small task is an inconvenience; you have to find the source to be sure of what it does; you can't do a quick fix or add some troubleshooting code)

>>810201 (JS solution)

... this is depressing. I'll leave it at that. It's basically a list of every entry except the Python entries anyway, which are just fucking revolting, every single one of them. Python's terrible but there's no reason for it to be this bad. What you do with a shitty scripting language is, you solve the problem in a natural and direct way, and if it's a BS website-tester problem with inputs designed to stress a hash-table or whatever, then you say "job done" and move on to more real problems. If your natural and direct way is too slow for a real-world problem, then you try and solve it with architecture (run less often / use caching / change requirements) or you use a more serious language.

▶Anonymous 10/27/17 (Fri) 04:11:54 No.811222

>>811126

the mantra is "0, 1, or Many"

and the mantra is not followed by anything important. Servers have finite RAM, finite CPUs, finite storage. Software running on servers therefore have tunable finite limits. Similarly when you put up a "my music collection" bookshelf you don't let someone base64-encode their 2TB harddrive and store the result in the name of a song, to use your website as their personal backup service.

so yeah fuck infinite lines.

▶Anonymous 10/27/17 (Fri) 04:13:55 No.811224

... well the thread's no longer bumping so I guess no need to look for the next terrible Python solution.

▶Anonymous 10/27/17 (Fri) 04:24:52 No.811227>>811237

>>811220

>I pick stuff that's aesthetic

>picks a bunch of non-functioning shit

lmfao

▶Anonymous 10/27/17 (Fri) 05:04:06 No.811237>>811243 >>811262

>>811227

according to this shitty website with a terrible spec and unexaminable inputs and no error messages or ability to troubleshoot. Real world, problems with these would just be fixed, very easily.

It doesn't much apply here, but in general, optimization is specialization: if you have multiple platforms, or if the upstream changes, or if a bug causes some of the data to be (correctibly) wrong, then it's usually easier to adapt a general solution than an optimized one.

▶Anonymous 10/27/17 (Fri) 05:19:10 No.811243>>811256

>>811237

I always do general solutions until I need optimized solutions, but people abuse that to justify absolute garbage that will never be able to be converted into an optimized solution.

▶Anonymous 10/27/17 (Fri) 05:56:11 No.811256

>>811243

there's no wisdom that can't be abused by a fool.

▶Anonymous 10/27/17 (Fri) 06:19:35 No.811262>>811267

>>811237

Which one is your attempt?

▶Anonymous 10/27/17 (Fri) 06:53:28 No.811267>>811284 >>811288

>>811262

obv. it's one of the failures.

this pajeet site is just as bad: http://www.spoj.com/submit/TEST/

but if you try to submit a solution you'll see how many more languages it offers: Ada, Dart, Forth, Common Lisp, Nim.

▶Anonymous 10/27/17 (Fri) 07:45:20 No.811284

File (hide): 26487239ce8e241⋯.png (153.3 KB, 652x904, 163:226, activejeet.png) (h) (u)

>>811267

"pajeet site" is not a joke

▶Anonymous 10/27/17 (Fri) 07:49:47 No.811288>>811289

>>811267

is this challenge serious?

TEST - Life, the Universe, and Everything

#basic #tutorial #ad-hoc-1

Your program is to use the brute-force approach in order to find the Answer to Life, the Universe, and Everything. More precisely... rewrite small numbers from input to output. Stop processing input after reading in the number 42. All numbers at input are integers of one or two digits."

Example


Input:
1
2
88
42
99

Output:
1
2
88

....


#!/bin/python2.7
while(1==1):
	input=raw_input()
	if input=='88':
		print input
	else:
		break

/tech/ - Technology★

General

WebM

Theme

User JS

Do not paste code here unless you absolutely trust the source or have read it yourself!

Favorites

Customize Formatting

Filters