Archive

Archive for the 'Written' category

[Tip] awk notes

May 19, 2010 Galaxy No comments

Specifying the output field separator:

awk -vOFS="\t" '{print $1,$2,$3,$4,$6}'
awk 'BEGIN { OFS = "\t" } {print $1,$2,$3,$4,$6}'
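
To make the effect of OFS concrete, here is a rough Python equivalent of the two commands above. It is only a sketch: the function name select_fields and its parameters are made up for illustration, and str.split() only approximates awk's default field splitting.

```python
def select_fields(line, columns=(1, 2, 3, 4, 6), ofs="\t"):
    """Pick 1-based columns from a whitespace-separated line and join them with ofs."""
    fields = line.split()  # like awk's default FS: runs of whitespace, edges trimmed
    # awk prints an empty string for a field that does not exist; mimic that here.
    picked = [fields[c - 1] if c <= len(fields) else "" for c in columns]
    return ofs.join(picked)


print(select_fields("a b c d e f g"))
```

The separator only appears between the selected fields, which is why the awk versions list the fields with commas instead of concatenating them.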


Category: Written

[Compiled] Computing N50 with R

November 11, 2009 Galaxy No comments

I call this "compiled", but really I am just pasting it here for now.

* N50 is calculated by first ordering all contigs by size and then adding the lengths (starting from the longest contig) until the summed length exceeds 50% of the total length of all contigs.

http://www.opensubscriber.com/message/r-help@r-project.org/10844139.html
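
The definition quoted above translates almost line by line into code. The sketch below is in Python rather than R, the function name n50 is made up for illustration, and it takes "exceeds 50%" literally as a strict comparison. Some tools use "at least 50%" instead, which can give a different answer when the running sum lands exactly on the halfway point.

```python
def n50(lengths):
    """Return the length of the contig at which the running sum first exceeds half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):  # longest contig first
        running += length
        if running * 2 > total:  # strict "exceeds 50%", as in the quoted definition
            return length
    raise ValueError("n50() needs at least one contig with positive length")


print(n50([10, 20, 30, 40, 50]))
```

For the five contigs in the usage line, the total is 150; the running sum is 50 after the first contig and 90 after the second, so the second contig's length is the result.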

Category: Written

[Original] Making Notepad default to UTF-8 for new files

November 5, 2009 Galaxy 2 comments

How can we make Notepad use UTF-8 as the default encoding for new text files?

It is easy: whenever Notepad opens a text file that starts with a UTF-8 BOM, it uses UTF-8 for that file. And when we create a new .txt file with New -> Text file, Windows just uses the default NullFile, that is, an empty file.

So, what if NewFile.txt contained the bytes EF BB BF instead of being a null file?
Well, then the problem is solved.
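
Producing such a three-byte file is straightforward. The sketch below does it in Python; the helper name write_bom_template is made up for illustration, the demo writes into a throwaway directory, and registering the file as the template for new text files is not shown.

```python
import codecs
import tempfile
from pathlib import Path


def write_bom_template(path):
    """Create a file whose only content is the UTF-8 BOM (bytes EF BB BF)."""
    path = Path(path)
    path.write_bytes(codecs.BOM_UTF8)
    return path


# Demo in a temporary directory; "NewFile.txt" is only the name used in the post.
with tempfile.TemporaryDirectory() as tmp:
    template = write_bom_template(Path(tmp) / "NewFile.txt")
    template_bytes = template.read_bytes()
    # Decoding with "utf-8-sig" strips the BOM, so the file reads as empty text.
    decoded = template_bytes.decode("utf-8-sig")

print(template_bytes.hex())
```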

Category: Written

How to call an object by a shorter name in Perl

September 16, 2009 Galaxy No comments

http://stackoverflow.com/questions/1430548

I am writing an object-oriented Perl module, “Galaxy::SGE::MakeJobSH”.

I want to write MakeJobSH->new() instead of Galaxy::SGE::MakeJobSH->new(), or use some other short name.

So, is there any way to do it?


You can suggest that your users use the aliased module to load yours:

use aliased 'Galaxy::SGE::MakeJobSH';
my $job = MakeJobSH->new();

Or you could export your class name in a variable named $MakeJobSH:

use Galaxy::SGE::MakeJobSH;  # Assume this exports $MakeJobSH = 'Galaxy::SGE::MakeJobSH';
my $job = $MakeJobSH->new();

Or you could export a MakeJobSH function that returns your class name:

use Galaxy::SGE::MakeJobSH;  # Assume this exports the MakeJobSH function
my $job = MakeJobSH->new();

I’m not sure this is all that great an idea, though. People don’t usually have to type the class name all that often.

Here’s what you’d do in your class for the last two options:

package Galaxy::SGE::MakeJobSH;
 
use Exporter 'import';
our @EXPORT = qw(MakeJobSH $MakeJobSH);
 
our $MakeJobSH = __PACKAGE__;
sub MakeJobSH () { __PACKAGE__ };

Of course, you’d probably want to pick just one of those methods. I’ve just combined them to avoid duplicating examples.

Category: Written

The longest Linux pipeline I have written so far

August 27, 2009 Galaxy 20 comments
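# ll is a common alias for ls -l; 84 bytes is exactly the size of the two-line warning quoted below.
ll -rt ./_log/*.o* | awk '$5==84 {print $9}' |
  perl -ne 'print "$1\n" if /(GP.*Chr.*\.sh)/' |
  while read -r a; do find . -name "$a"; done |
  while read -r ss; do qsub -l vf=280M -cwd "$ss"; done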

……
As for the reason: on some compute nodes the default shell is not bash, and if the script has no shebang line, this error goes to STDOUT:

Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.

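It could also be something else; anyway, this is what I went with...


Postscript: this still did not work. I have no idea what exactly is wrong with the compute nodes' configuration.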

Post-postscript: in the end, what worked was #$ -S /bin/bash

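Playing with lookaround assertions in Perl regex

August 12, 2009 Galaxy 3 comments

I have used regular expressions quite a bit by now, so it is time to play with the advanced stuff...

1. Sort sample names by their last group of digits.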

sub sortlibname {	# Not many libs, so no Schwartzian needed.
	my ($aa,$bb,$at,$bt);
	$a =~ /^.*((?<=\D)\d+)(\D*)/;	# for AR2202-3m, we need (3,m).
	$aa=$1;$at=$2;
	$b =~ /^.*((?<=\D)\d+)(\D*)/;
	$bb=$1;$bt=$2;
	$aa <=> $bb
		||
	$at cmp $bt;
}
 
foreach my $sample (sort sortlibname keys %hSampleLib) {}

2. Strip every abbreviation of "chromosome", from the first 3 characters up to the full word. (This one is a bit EP...)

my ($chrid,$svtype,$start,$end)=(split /\t/)[0,1,4,5];
$chrid =~ s/^chr
	(?>
		((?<=^chr)o)?
		((?<=^chro)m)?
		((?<=^chrom)o)?
		((?<=^chromo)s)?
		((?<=^chromos)o)?
		((?<=^chromoso)m)?
		((?<=^chromosom)e)?
	)//xi;
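
For comparison, both tricks can be sketched in Python. This is an illustration rather than a replacement for the Perl above: the names lib_sort_key and strip_chr_prefix are made up, the first pattern is copied from item 1, and the second swaps the chain of lookbehinds for nested optional groups, which match the same prefixes.

```python
import re

# Same pattern as in item 1: the lookbehind (?<=\D) forces \d+ to start at the
# beginning of the last digit run instead of capturing only its final digit.
_LAST_NUMBER = re.compile(r'^.*((?<=\D)\d+)(\D*)')

# "chr" plus the longest possible prefix of "omosome", case-insensitive.
_CHR_PREFIX = re.compile(r'^chr(?:o(?:m(?:o(?:s(?:o(?:me?)?)?)?)?)?)?', re.IGNORECASE)


def lib_sort_key(name):
    """Return (last number, trailing text), e.g. (3, 'm') for 'AR2202-3m'."""
    m = _LAST_NUMBER.match(name)
    if m is None:  # no digit run preceded by a non-digit; sort such names first
        return (0, "")
    return (int(m.group(1)), m.group(2))


def strip_chr_prefix(chrid):
    """Remove a leading 'chr', 'chro', ..., 'chromosome' from a chromosome ID."""
    return _CHR_PREFIX.sub("", chrid, count=1)


print(sorted(["AR2202-10m", "AR2202-3m", "AR2202-3k"], key=lib_sort_key))
print(strip_chr_prefix("Chromosome12"))
```

Computing the key once per name gives the effect of a Schwartzian transform for free, since sorted() calls the key function only once per element.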

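Actually, I wrote 2 before 1. Without 1, I would have been too embarrassed to start a post...

Optimization? This is my level for now; I will work through it slowly later.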

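Trying out Perl_XS

July 24, 2009 Galaxy 1 comment

I need to look things up in 3 tables. SQLite is fairly fast, but the statement "SELECT score FROM dbCNSblk$opt_s WHERE chrid=? AND ? BETWEEN begin AND end" is still too slow, so I had to stuff everything into memory.
Since this is all for efficiency anyway, why not play with C? So...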

h2xs -A -n ChromByte

ChromByte/ChromByte.xs

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include "ppport.h"
#include <stdlib.h>	/* malloc, free */
 
MODULE = ChromByte              PACKAGE = ChromByte
 
long
initchr(len)
        int len
        CODE:
                void *address = malloc( len + 1 );
                memset( address , 0 , len + 1 );
                RETVAL = (long)address;
        OUTPUT:
                RETVAL
 
void
setbases( address, begin, end, val )
        long address
        int begin
        int end
        int val
        CODE:
                char * buf = ( char * ) address ;
                memset( buf + begin , val , end - begin + 1 );
 
int
getbase( address, pos )
         long address
         int pos
        CODE:
                char * buf = ( char * ) address ;
                RETVAL = *( buf + pos );
        OUTPUT:
                RETVAL
 
void
freechr( address )
        long address
        CODE:
                void * buf = ( void * ) address ;
                free( buf );
 

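For the record, I asked Li Xu to help write this C code and then modified it myself. I am still at the stage where I can read C but have not memorized the syntax (it seems I have been at this stage for 8 years now...).
When I have time I really must fill in the gaps in my C and C++!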
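
The idea behind the XS code above, one byte of memory per chromosome position, can be sketched in Python with a bytearray. This is only an illustration of the data structure, not the ChromByte module itself: the class name ChromBytes and its method names are made up, and values are limited to 0..255 here, whereas the C version returns a char.

```python
class ChromBytes:
    """One byte per position, zero-initialised, like initchr() in the XS code."""

    def __init__(self, length):
        self._buf = bytearray(length + 1)  # +1 so that position `length` is valid

    def set_bases(self, begin, end, val):
        """Fill positions begin..end (inclusive) with val, like setbases()."""
        self._buf[begin:end + 1] = bytes([val]) * (end - begin + 1)

    def get_base(self, pos):
        """Return the value stored at pos, like getbase()."""
        return self._buf[pos]


chrom = ChromBytes(10)
chrom.set_bases(2, 5, 7)
print(chrom.get_base(3), chrom.get_base(6))
```

Unlike the C version there is no freechr() counterpart, because the buffer is released when the object is garbage collected.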


Category: Written