A shallow clone is a git clone that downloads only part of a repository's commit history. There are three ways that we found to do this. It can be useful when you only need the recent history of a large git repository rather than every commit. As a bonus we also cover how to fetch a single commit and partial clones.
If you want to improve the overall speed of git commands in general, we suggest upgrading to the latest version, which includes significant improvements to the commit-graph.
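For example, on recent versions of git you can enable and build the commit-graph file yourself to speed up history traversal (a quick sketch; these commands exist in modern git, but the defaults vary by version):

git config core.commitGraph true
git commit-graph write --reachable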
Three Ways To Shallow Clone
The only prerequisite is that you have at least git version 1.9, and the command is as follows:
git clone <repository> --depth 1
This will download only the latest commit instead of the entire history, which should make things considerably faster. If you are using git 2.11.1 or newer you can also use --shallow-exclude with a remote branch or tag:
git clone <repository> --shallow-exclude=<remote branch or tag>
Or
git clone <repository> --shallow-since=YYYY-MM-DD
Bonus: Fetch A Single Commit
As of git version 2.5 you can fetch a single commit. This is handled by git-upload-pack, which is typically not invoked by the user directly but runs as part of git fetch. You can use a shallow fetch in combination with the SHA-1 of a commit to fetch just that commit. Let's say for example we want to fetch only this commit from Apache Flink: https://github.com/apache/flink/commit/c0cf91ce6fbe55f9f1df1fb1626a50e79cb9fc50
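A minimal sketch of how that could look (the directory name is arbitrary, and the server must allow fetching unadvertised objects by SHA-1, which GitHub does):

git init flink-single-commit
cd flink-single-commit
git remote add origin https://github.com/apache/flink.git
git fetch --depth 1 origin c0cf91ce6fbe55f9f1df1fb1626a50e79cb9fc50
git checkout FETCH_HEAD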
For those who are short on time, please find below a minimal snippet for correctly logging to CSV with Python 3.8:
*READ BELOW TO SEE HOW WE ADD A HEADER AND DO LOG ROTATION
import logging
import csv
import io

class CSVFormatter(logging.Formatter):
    def __init__(self):
        super().__init__()

    def format(self, record):
        # Write the list passed to the logger as a single CSV row into an
        # in-memory buffer, then hand the CSV string back to logging.
        stringIO = io.StringIO()
        writer = csv.writer(stringIO, quoting=csv.QUOTE_ALL)
        writer.writerow(record.msg)
        record.msg = stringIO.getvalue().strip()
        return super().format(record)

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
# loggingStreamHandler = logging.StreamHandler()
loggingStreamHandler = logging.FileHandler("test.csv", mode='a')  # to save to file
loggingStreamHandler.setFormatter(CSVFormatter())
logger.addHandler(loggingStreamHandler)
logger.info(["ABDCD", "EFGHI"])
Let's explain some of the above code further, along with its implications, and then see how to incorporate writing a header. The parts about the logger, log levels and the formatter are explained in the previous post; feel free to catch up quickly if you would like: https://everythingtech.dev/2021/03/python-logging-with-json-formatter/
CSV Formatter
Just like we did for the JSON formatter, we are extending the logging Formatter class to override the format method. Using the csv and io libraries here also means there are a few performance implications. If performance is critical for you and/or logging forms a significant part of your system, then I would recommend benchmarking your system before and after adding the CSV logging part.
Anyway, moving on, we are calling csv.writer, which needs a destination to write the CSV lines to as well as quoting settings. In this case we have provided an in-memory text stream, a StringIO object from the io library. We do this because we are already saving the log to disk with our FileHandler.
Adding CSV Header & Rotating Logs
Adding a CSV header is trickier than it looks, especially because we are looking to add log rotation. This means that we have to write a header before a log file is created and again just after it is rolled over. It turns out that TimedRotatingFileHandler closes the stream it is writing to just before rolling over, and the stream is not accessible before initialisation. There are 2 options to bypass this that I thought about (maybe there is a better one? If you know it please comment below):

Option 1: Rewrite the constructor of TimedRotatingFileHandler. But this also means that any breaking change in the library could break our application.

Option 2: Check the file size before writing to the log and, if the file is empty, write the header. This is expensive though, because we basically perform an additional check before every write (the emit function). In this post, until I find a better way, we use this method.
So the first part of the code is quite straightforward: we are extending the TimedRotatingFileHandler class. You will notice other useful parameters like when and interval. These two are important and determine when your logs will rotate. The following are the possible values for when, taken from the original code:
# Calculate the real rollover interval, which is just the number of
# seconds between rollovers. Also set the filename suffix used when
# a rollover occurs. Current 'when' events supported:
# S - Seconds
# M - Minutes
# H - Hours
# D - Days
# midnight - roll over at midnight
# W{0-6} - roll over on a certain day; 0 - Monday
# Case of the 'when' specifier is not important; lower or upper case
# will work.
We have added additional constructor parameters: header, retryCount and retryInterval. The header parameter is there so you can specify a CSV header. retryCount and retryInterval have been added to make sure that the header is written first in the log files. Logs are written asynchronously, and if you take a look at the emit code we have to make sure that the header is written before any log records are.
Of course, we do not want to be stuck indefinitely in a loop. At every retryInterval a check is performed to make sure we have written the header, and this is retried a default of 5 times before raising a dummy exception that we created. You can then handle it however you like, but it should work fine.
This has to be repeated every time we roll over as well, which is why we first assigned the retry count to a constant, so we can restore it after each rollover. The way we write the header is to use the stream created by the logger directly and call the csv library to write to that stream. The same csv options also have to be provided for your header; you can add additional constructor parameters to do that.
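The full class is not reproduced here, but a minimal sketch of the "check the file size before writing" approach (Option 2) could look like this. The class name, the header parameter and the helper method are illustrative, not the post's exact code:

import csv
import io
import logging
import os
from logging.handlers import TimedRotatingFileHandler

class CSVTimedRotatingFileHandler(TimedRotatingFileHandler):
    def __init__(self, filename, when='h', interval=1, header=None, **kwargs):
        self.header = header or []
        super().__init__(filename, when=when, interval=interval, **kwargs)

    def _write_header_if_needed(self):
        # Option 2: before each write, check whether the current log file
        # is empty and, if so, write the CSV header first.
        if self.header and os.path.getsize(self.baseFilename) == 0:
            buffer = io.StringIO()
            csv.writer(buffer, quoting=csv.QUOTE_ALL).writerow(self.header)
            self.stream.write(buffer.getvalue())

    def emit(self, record):
        try:
            # Handle the rollover ourselves so the header lands in the
            # new file before the first record does.
            if self.shouldRollover(record):
                self.doRollover()
            if self.stream is None:
                self.stream = self._open()
            self._write_header_if_needed()
            logging.FileHandler.emit(self, record)
        except Exception:
            self.handleError(record)

Usage could then look like: handler = CSVTimedRotatingFileHandler("test.csv", when="midnight", header=["col1", "col2"]).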
Disclaimer
This has not been tested in production, is meant for learning purposes only, and I recommend that you either use a third party library or adapt the code to your situation. For example, create a better exception that you can handle when the logs are in the wrong order (if you modify the code). Anyway, I hope this has saved you time, and feel free to comment if you want to achieve anything else and are stuck. Thanks for reading! 🙂
Disclaimer: This tutorial has been tested on Python 3.8.
If you are short on time, please find below the minimal required code to turn log messages into correct JSON format in Python using the builtin logging module:
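The snippet below is a minimal sketch, reconstructed from the explanation that follows (the dictionary passed to logger.info is an arbitrary example):

import logging
import json

class JSONFormatter(logging.Formatter):
    def format(self, record):
        # record.msg holds the dict passed to logger.info(...);
        # json.dumps converts it correctly to a JSON string.
        return json.dumps(record.msg)

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
loggingStreamHandler = logging.StreamHandler()
loggingStreamHandler.setFormatter(JSONFormatter())
logger.addHandler(loggingStreamHandler)
logger.info({"data": 123})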
If you are still reading, I am going to assume that you are interested in learning how the above works. First of all, we create a logger object with logger = logging.getLogger(__name__). If you are not familiar with the special variable __name__, it changes depending on where the script is run from. For example, if you run the script directly its value will be "__main__". However, if it is run from a module then it will take the module's name.
Setting The Log Level
You can set different log levels to the logger object. There are 5 log levels in the following Hierarchy:
DEBUG
INFO
WARNING
ERROR
CRITICAL
It is a hierarchy because if you set the log level at, say, INFO, then only INFO, WARNING, ERROR and CRITICAL messages are logged, i.e. DEBUG messages are ignored. If you are accessing a logger and are not sure which log levels it will log, you can always call logger.isEnabledFor(logging.INFO) to check whether a particular level is enabled (replace logging.INFO with any other level you want to check).
Anyway we set the log level to DEBUG in this example: logger.setLevel(logging.DEBUG)
Handlers
Handlers are used to send messages to destinations. In this case, the StreamHandler is used to send messages (like you guessed) to streams, which includes the console. There are 14 handlers which come built in; you can find more information on them here: https://docs.python.org/3/howto/logging.html#useful-handlers
As an alternative you can change the handler to a FileHandler to save to disk.
Formatter
In the code above we create a class JSONFormatter which inherits from logging.Formatter. Then we override the format method of the class to write our own format. Here, we have made use of the json library to make sure we are converting dictionaries correctly to JSON. Feel free to use any other formatting that you like. The message passed to .info here is stored in record.msg. If you would like to find all the attributes stored in record, you can inspect record.__dict__. You should see something like the following:
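The exact values depend on your script, but the output looks roughly like this:

{'name': '__main__', 'msg': {'data': 123}, 'args': (), 'levelname': 'INFO', 'levelno': 20, 'pathname': 'test.py', 'filename': 'test.py', 'module': 'test', 'exc_info': None, 'exc_text': None, 'stack_info': None, 'lineno': 14, 'funcName': '<module>', 'created': 1615462110.1, 'msecs': 100.0, 'relativeCreated': 5.2, 'thread': 140027, 'threadName': 'MainThread', 'processName': 'MainProcess', 'process': 1234}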
This gives you some useful information that you can make use of depending on your needs. In this case we have only used msg.
There is a bit more to this than what I have covered, and I would recommend checking out the official documentation and inspecting the code of the module. You can also comment below if you would like to see any other logging article.
This is a followup to building a basic honeypot. In this post we are analysing the attacks we collected over one week, eh wait, actually over only one day. It so happens that around 100,000 attack payloads were collected in a single day, and I think that is enough data to analyse (manually) for now.
Note that the payloads below are only the first payload sent before each attack actually begins. There were also many attacks that simply probe for open ports, which I have ignored as not worth documenting.
Microsoft Remote Desktop Vulnerability
The large majority of attacks (a whopping 75%) sustained by the honeypot were attempts to detect vulnerable Microsoft Remote Desktop servers. The payload was as follows:
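It is the same mstshash probe we saw while the honeypot was running, for example:

Received b'\x03\x00\x00+&\xe0\x00\x00\x00\x00\x00Cookie: mstshash=hello\r\n\x01\x00\x08\x00\x03\x00\x00\x00' from ('13.65.92.2', 59134)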
This one is likely an SMB relay attack, conducted to eventually gain access to a Windows machine by exploiting what was, at the time, a design flaw. The payload looks like the following:
This was a simple GET request attempting to access a .env file, a configuration file used by tools like Docker Compose and many web frameworks. It can contain information like API keys or even passwords.
Simply a probe for an accessible admin page of Apache Solr, which should not be publicly reachable for obvious reasons.
'GET /solr/admin/info/system?wt=json HTTP/1.1\r\nHost: 128.199.193.172:8983\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36\r\nAccept-Encoding: gzip\r\nConnection: close\r\n\r\n' from ('45.155.205.225', 55070)
CVE-2019-7238 RCE vulnerability in Sonatype Nexus Repository Manager
Based on the reference below, this is a vulnerability present in Sonatype Nexus Repository Manager installations prior to version 3.15.0. The attack payload is as follows:
'POST /service/extdirect HTTP/1.1\r\nHost: 128.199.193.172:8081\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36\r\nContent-Length: 293\r\nContent-Type: application/json\r\nAccept-Encoding: gzip\r\nConnection: close\r\n\r\n{"action":"coreui_Component","data":[{"filter":[{"property":"repositoryName","value":"*"},{"property":"expression","value":"1==1"},{"property":"type","value":"jexl"}],"limit":50,"page":1,"sort":[{"direction":"ASC","property":"name"}],"start":0}],"method":"previewAssets","tid":18,"type":"rpc"}'
Although I am not sure whether this attack was actually looking for vulnerable Ethereum nodes, there are Ethereum nodes vulnerable to JSON-RPC attacks. The attack payload is as follows:
As I go through the attacks collected I am realising that the list is too long, and I sadly won't be able to go through all of them manually. In the coming days I will try to think of a better way to identify and categorise them.
Anyway I hope that this was informative and that this post has motivated you to take the necessary precautions. The web is a scary place!
RPC failed; curl 55 OpenSSL SSL_write: SSL_ERROR_ZERO_RETURN, errno 10053. fatal: the remote end hung up unexpectedly
One of the reasons for this error is that your commit may contain a large file that fails to push, eventually hanging with "fatal: the remote end hung up unexpectedly".
If you can confirm that the issue is indeed a large file, then you should use git lfs. Note that git lfs will intercept the large files being tracked and upload them to its own server:
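On macOS, for example, it can be installed with Homebrew (an assumption here; use your platform's package manager or the binary release otherwise):

brew install git-lfs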
git-lfs has a lot of dependencies (go, libyaml, openssl@1.1, readline and ruby) so it might take a while to install. You can also install from the binary release by running its ./install.sh.
Once installed you can track large files, for example (as in the docs):
git lfs track "*.psd"
Your tracked files' details are saved inside a .gitattributes file, so make sure to add .gitattributes to the repository to persist tracking when other users clone the project:
git add .gitattributes
That’s it! You should then be able to safely add, commit and push!
By default the large files will be kept on disk locally, i.e. only the pointers are pushed remotely. You can push the objects themselves to the endpoint (run git lfs env to find it), e.g. GitHub, using the following command:
git lfs push --all <remote e.g. origin>
If you are still having trouble, feel free to comment, although I don't guarantee that I have the answer. Git LFS is best used with Bitbucket; read more here: https://www.atlassian.com/git/tutorials/git-lfs
The internet is a dangerous place. There are malicious bots online going through IP addresses (brute-forcing known IP ranges) to find any vulnerable servers to attack. In this post we are creating a HoneyPot in Python with asyncio in order to bait them into attacking us so we can analyse their attacks.
The bots do this by sending specific payloads (requests with specific data) and checking if the expected response was received. In this post we will attempt to create a simple honeypot which listens to all ports from 2 to 65535 (which is the maximum port number). Port 0 and Port 1 will not be used by the HoneyPot.
Server – DigitalOcean
The HoneyPot will run on a DigitalOcean VM because some of their IP ranges are known to attackers, ref: https://ipinfo.io/AS14061. Port 1 will actually be used for SSH by us, because we want to capture attacks on port 22 (the default SSH port). Attacks against weak passwords should be common (note this is an assumption).
Quick Python 3.8 Setup
As mentioned in the title you will need to install Python 3.8. If you are running Ubuntu 16.04 you can do the following:
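One common way is the deadsnakes PPA; we also raise the open-file limit for the current shell (a sketch, adapt to your setup):

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.8
ulimit -n 100000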
Note that we have increased the limit on open files to 100,000. This is because every socket opened on Linux counts as an open file. You should only need 65,535, but 100,000 is safer (for example, logging is going to use at least one).
You should then also modify /etc/ssh/sshd_config and change the port to 1. When you ssh from the client side you can then do, for example:
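ssh -p 1 <user>@<server-ip>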
Asyncio is used to execute multiple tasks at a time on a single thread. We make use of coroutines (async def) which listen (asyncio.start_server) on a specific port and log any requests received (with a file logger writing to honeypot.log). We then launch all the tasks at runtime (asyncio.gather). There are 3 parameters that can be passed through the command line (done with Fire): address, port_start and port_end.
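A minimal sketch of that design (the handler names, read size and log format here are assumptions, not the original script):

import asyncio
import logging

import fire  # the `fire` package turns main() into a command line interface

# File logger: every payload received ends up in honeypot.log
logger = logging.getLogger("honeypot")
logger.setLevel(logging.INFO)
logger.addHandler(logging.FileHandler("honeypot.log"))

async def handle(reader, writer):
    # Read the first payload the attacker sends and log it with its origin
    data = await reader.read(4096)
    addr = writer.get_extra_info("peername")
    logger.info("Received %r from %s", data, addr)
    writer.close()

async def serve(address, port):
    try:
        server = await asyncio.start_server(handle, address, port)
        async with server:
            await server.serve_forever()
    except OSError:
        pass  # port unavailable, skip it

async def run(address, port_start, port_end):
    # One listening coroutine per port, all on a single thread
    await asyncio.gather(*(serve(address, port) for port in range(port_start, port_end + 1)))

def main(address="0.0.0.0", port_start=2, port_end=65535):
    asyncio.run(run(address, port_start, port_end))

if __name__ == "__main__":
    fire.Fire(main)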
Please feel free to comment if you are having any trouble running it.
Next Step: Honeypot.log Payload Analysis
I plan to collect sample attacks in honeypot.log and do another blog post analysing each of them. Initially I had planned to run it for a whole week, but to my surprise the server was already receiving hundreds of attacks per minute.
For example one common attack that I can notice right away is the following:
Received b'\x03\x00\x00+&\xe0\x00\x00\x00\x00\x00Cookie: mstshash=hello\r\n\x01\x00\x08\x00\x03\x00\x00\x00' from ('13.65.92.2', 59134)
Cookie: mstshash=hello is a payload used to probe for (and eventually compromise) Microsoft Remote Desktop servers.
The usual way to recruit someone is to ask for a curriculum vitae and a motivation letter. If you have been on the interviewer side like me, you will realise that many candidates upload the same curriculum vitae and motivation letter everywhere, perhaps with a few changes.
This should really be considered spam. In small teams where there is only one person interviewing, for example in a small company where the manager or team leader looks at CVs in his or her spare time, it can become really challenging to go through all the CVs, and it reduces the time that can be spent on real, deserving candidates.
Instead of just asking for a CV and motivation letter (and, for software positions, additional links like a GitHub profile or a portfolio website), the job boards (maybe) or the career page of the company's website should add an extra layer of filtering.
First Layer: Technical Question Multiple Choice
The first layer should be something that can be easily automated and prunes some of the unqualified candidates. This can be done with a multiple choice questionnaire with a time limit, either per answer or for the entire questionnaire. This forces candidates to actually possess the basic knowledge required for the job.
Ideally the questions should also be randomised every time the test is taken.
Second Layer: Question With Use Cases
Most people lie in their CVs and motivation letters, so we need to make sure they actually have the advertised skills. In this layer we should ask about use cases and harder questions related to the work they will carry out. For example, if they are going to work as a Data Engineer, ask how they would architect data pipelines, data collection and data processing, and ask them to provide real past examples.
This also filters out people who are just spamming, for example “Easy Apply” on LinkedIn.
This step cannot be automated and should probably be timed but will at least give you a better idea of the candidate.
Actually, we can almost entirely eliminate the use of CVs and motivation letters with this step.
Every chess player wants one thing: to become a better player. Obviously, players are all different; they each have their own temperament and learning speed. However, if you want to improve your chess skills and become a stronger player, then you have come to the right place!
This article will help you go through this process, irrespective of your chess level. The basic principle is to practice each of the following steps during your preparation and put them to use.
The majority of these ways to improve can be found on chess.com, and we will be referencing it plenty below.
So let’s start now!
Learn Chess Tactics
For the regular player, learning tactics yields better results more easily than anything else. You can win or lose a piece off the back of a planned combination. In other words, short-term combinations lead to material gains or a draw, for example pins, forks, skewers, checkmates, etc.
The same types of positions come up again and again across several tactical themes. You only need to learn to recognise these spots when it is your turn to move, and to understand your opponent's moves. During a game this will give you great confidence: spotting the move will save you time or give you the upper hand over your opponent.
A handy trick, again on chess.com, is to practice puzzles. Puzzles are a fast way to build muscle memory for handling different situations with the most optimal move. Hikaru himself mentioned that this is one of the best ways to develop intuition!
Vision
Learning how to quickly spot squares from their coordinates, like a1 or h2, will help you greatly when reviewing videos, livestreams or even reading about chess.
Tunnel Vision
During matches you will have a tendency to focus on only part of the board or part of the strategy, for example only offensive moves or only defensive moves. You need to learn to take a step back and reevaluate your weaknesses and opportunities at every move.
Videos
Watch livestreams at https://www.twitch.tv/Chess; they are usually commentated by top level chess players. Their commentary will help you immensely in learning how to think and how to play. This is a kind of tutoring available for free from the best.
Another piece of advice on videos is to look for Twitch streams of top chess players training for the PogChamps tournament. They are coached on everything from openings to best practices to how to checkmate. An example video:
Learn How to Control the Center
The four squares in the middle of the board should be used properly; they are the crucial part of your tactics for the rest of the board, letting you move your pieces wherever you want. Please do not leave your king in the board's center for long; it exposes the all-important piece to attacks.
The Endgame Is Crucial for Winning
From my training at beginner level, I've learned that I also need to spend more time on chess endings. There is no reason to lose a match that you can easily win! The endgame is fun and full of twists that every player needs to know. It is crucial for winning the game.
Opening Basics
It is impossible to know all the openings, so a basic understanding of which opening you are in will greatly help you. It will help you consider what types of plans and moves you should use. You need to study them closely.
Playing and Analyzing, Repeat the Cycle
This rule is very simple: the more you play, the better you understand the game. One part is learning a new chess concept; the second part is executing it well in a game.
The third part is analysis. After any move played by your opponent, you can pause and ask this set of questions to analyse it:
What is the point of this move?
Does it require any urgent action?
Is it a good or bad move, and why?
You can also go through the same set of questions, even though the moves seem to be clear. To sum up, (1) plan, (2) play, (3) evaluate, (4) play again, and repeat the process over and over again.
On chess.com you will be given the option to analyze the whole match with an AI, which will evaluate your play and suggest the best moves at different points during the match.
Lastly, Be Motivated
After a certain level, the loss of passion is one of the key challenges most players face. Motivation is the most critical component of progress in chess. You need to find the right inspiration to keep going.
Finally
You can always improve your game in chess. Getting better can be both fun and simple with the right habits and mindset. You only need to learn the rules, play a lot of games, analyze them, study the endgame, not waste too much time on openings, and play your moves carefully.
Choosing to invest in stocks is similar to buying a piece of a company. Below is a checklist that I personally go through before investing in a company. It is based on only 3 years of investing, but hopefully it will be helpful to you.
Choose a trustworthy broker. Ideally go to the bank where you currently have your account and ask if they have a department that helps you buy stocks on the popular exchanges. The commission rates will usually apply to the first purchase only.
Invest over a long period of time and do not invest all your money at once. Investing smaller amounts over long periods provides an additional layer of safety against crashes like the recent Covid-19 related one. My recommendation is to invest over at least 5 years.
Never short any stock. This is one way you can lose more money than you have invested.
Don't use leverage. If you ever decide to use leverage, use a take profit and a stop loss with a ratio of 3:1, and make sure the stop loss risks only 1% of your total account. For example, on a $10,000 account the stop loss would risk at most $100 while the take profit targets $300.
Know the domain in which you are investing. For example, if you decide to invest in Cloudflare, make sure you know about information technology and understand the value they provide to their customers.
Check the company's financials for the past 10 years if available. Ideally you want a company that has been trending towards positive profit over the last few years, with low debt. For example, https://seekingalpha.com/symbol/NET/income-statement shows that the company is growing.
Check insider trading. Are people working at the company selling their stocks? This can be a red flag.
Apart from Seeking Alpha, I also suggest taking a look at https://simplywall.st/ to find more useful information (not sponsored, it's just useful).
Reinvest dividends back into your account. This will have a compounding effect on the growth of your portfolio.
Disclaimer
Note that this guide is not incentivising you to invest in Cloudflare; it is only used as an example. Ideally you should only invest an amount that you are comfortable setting aside.
Disclaimer: This post is not sponsored by Metadefender, Cloudinary or AWS S3; they are simply recommended as one simple solution.
File upload attacks occur in 2 stages. The first stage is to upload a file containing malicious code, and the second stage is to find a way to execute the malicious code in that file.
The risks associated with unrestricted file upload and eventual remote code execution range from anything between simple defacement of a website to a complete system takeover.
Tips To Protect Against File Upload Attacks
Deny file listing, and patch vulnerabilities (and combinations of vulnerabilities) in file listing that could allow an attacker to execute the uploaded payload.
Find and patch vulnerabilities in the libraries handling your uploaded files.
Add more safeguards like checking content-type and file extension.
Filter out special characters from filenames and extensions.
Follow the best security practices for your web server, for example for Microsoft IIS.
Make sure only authenticated users are allowed to upload files.
You need to ensure that you have a malware scanner or antivirus examine not only the file but also the file system and network for suspicious activity.
Randomize filenames once files are uploaded, to make them harder to execute (see the sketch after this list).
Disable verbose errors on the web user interface and instead give simple errors when a file upload fails.
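To illustrate the extension check and filename randomisation tips above, here is a minimal Python sketch (the function name and the allowed extensions are assumptions to adapt to your application):

import os
import secrets

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}  # assumption: adjust per application

def safe_upload_name(original_name):
    # Whitelist the extension, then replace the attacker-controlled name
    # with a random one so the stored path cannot be predicted.
    ext = os.path.splitext(original_name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return None
    return secrets.token_hex(16) + ext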
There are many ways in which you can be attacked, and many ways to prevent those attacks. Keeping up to date and securing file uploads can easily take up a massive amount of your time as a developer or website owner. That is why you could alternatively use a third party service like Metadefender, AWS S3 or Cloudinary.
Of course, there are costs associated with using third party services, but in reality it takes only one attack to cause more damage than what you pay those services.
The developers at those companies work continuously on patching security vulnerabilities, while you get to focus on improving your business.